“Novel Airborne Video Sensors;

Super-Resolution Multi-Camera Panoramic Imaging System for UAVs”


Abstract 


Objective  and  purpose:  Application  of  a  camera  array  as  a  flexible,  reconfigurable, 
inexpensive  high-resolution  panoramic  motion-imagery  sensor  for  low-altitude 
reconnaissance aircraft is investigated.

Methods  employed,  restrictions  and  limits:  Assuming  multiple-view  noisy  image  position 
measurements  of  terrain  features  and  known  camera  projection  matrices  by  calibration,  terrain 
feature  localization  and  UAV  positioning  are  analyzed  by  computer  simulations,  with/without 
supplementary  gyro  and  GPS. 


Results and conclusions: The impact of various system parameters on the achievable
precision of the panoramic system in 3-D terrain feature localization and UAV motion
estimation is determined for the A = 0.5-2 [km] flight altitude range. Enhancement of
estimation accuracy from GPS and gyro is explored. Estimation error variance plots are given
as a function of camera resolutions, viewing angles, flight altitudes, GPS and altitude
measurement errors, number of views, etc. Selected results, from point correspondences in
4[Kpix]x4[Kpix] images and utilizing GPS readings with one-meter error variance at
0.5-2 [km] altitudes, comprise: estimating 3-D coordinates of ground features tracked in
1-2 dozen images with A/10 baselines at sub-meter accuracy; determining UAV pose with
0.1-0.3 [deg] variance by matching 2-3 dozen features in two views. The results provide
valuable guidelines for the integration of camera-array images into one super-resolution
panorama, registering multiple panoramas to construct a single composite view, integration
of visual servo with onboard sensors, map-based navigation, and UAV positioning. Computed
performance charts enable the design of an optimal high-resolution imaging system based on
the UAV size and capability constraints.

Figure 13: Closed-Form 8-Point Algorithm- Uncertainty (reconstruction variance [m]) in terrain
feature  localization  by  tracking  15  (left)  and  30  (right)  points  with  noise-free  GPS  (Altitude  500);  See 
text  for  details. 

Figure  13:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 


Figure  13:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  13:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  13:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  2  [m]. 

Figure  13:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  2  [m]. 

Figure 14: Closed-Form 8-Point Algorithm- Uncertainty (reconstruction variance [m]) in terrain
feature  localization  by  tracking  15  (left)  and  30  (right)  points  with  noise-free  GPS  (Altitude  2000). 

Figure  14:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 

Figure  14:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  14:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  14:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  2  [m]. 

Figure 15: Closed-Form 8-Point Algorithm- Uncertainty (reconstruction variance [m]) in terrain
feature  localization  by  tracking  15  (left)  and  30  (right)  points  with  noise-free  GPS  (Altitude  4000). 

Figure  15:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 

Figure  15:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  15:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  15:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  2  [m]. 

Figure  15:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  2  [m]. 

Figure 16: Closed-Form Solution with Small-Rotation Approximation- Uncertainty (reconstruction
variance [m]) in terrain feature localization by tracking 15 (left) and 30 (right) points with
noise-free GPS (Altitude 500); See text for details.

Figure  16:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 

Figure  16:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  16:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  16:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  2  [m]. 

Figure  16:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  2  [m]. 

Figure 17: Closed-Form Solution with Small-Rotation Approximation- Uncertainty (reconstruction
variance  [m])  in  terrain  feature  localization  by  tracking  15  (left)  and  30  (right)  points  with  noise-free 
GPS  (Altitude  2000). 

Figure  17:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 


Figure  17:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  17:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure 17: (continued)- Tracking 15 (left) and 30 (right) points with GPS error variance of 2 [m].

Figure 17: (continued)- Tracking 45 (left) and 60 (right) points with GPS error variance of 2 [m].

Figure  18:  Closed-Form  Solution  with  Small-Rotation  Approximation-  Uncertainty  (reconstruction 
variance  [m])  in  terrain  feature  localization  by  tracking  15  (left)  and  30  (right)  points  with  noise-free 
GPS  (Altitude  4000). 

Figure  18:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 

Figure  18:  (continued)-  Tracking  15  (left)  and  30  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure  18:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  1  [m]. 

Figure 18: (continued)- Tracking 15 (left) and 30 (right) points with GPS error variance of 2 [m].

Figure  18:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance  of  2  [m]. 

Figure  19:  Nonlinear  Iterative  Solution-  Uncertainty  (reconstruction  variance  [m])  of  3  sample  points 
in terrain feature localization by tracking 3 (left) and 6 (right) points in images with LxL (L={1,3,4})
resolutions  (Altitude  500);  See  text  for  details. 

Figure  19:  (continued)  -  Tracking  9  (left)  and  12  (right)  points. 

Figure  20:  Nonlinear  Iterative  Solution-  Uncertainty  (reconstruction  variance  [m])  of  3  sample  points 
in terrain feature localization by tracking 3 (left) and 6 (right) points in images with LxL (L={1,3,4})
resolutions  (Altitude  2000);  See  text  for  details. 

Figure  20:  (continued)  -  Tracking  9  (left)  and  12  (right)  points. 

Figure  21:  Nonlinear  Iterative  Solution-  Uncertainty  (reconstruction  variance  [m])  of  3  sample  points 
in terrain feature localization by tracking 3 (left) and 6 (right) points in images with LxL (L={1,3,4})
resolutions  (Altitude  4000);  See  text  for  details. 

Figure  21:  (continued)  -  Tracking  9  (left)  and  12  (right)  points. 

Figure  22:  Nonlinear  Iterative  Solution-  Uncertainty  (reconstruction  variance  [m])  in  terrain  feature 
localization by tracking 3 (left) and 6 (right) points in images with LxL (L={1,3,4}) resolutions
(Altitude  500);  See  text  for  details. 

Figure  22:  (continued)  -  Tracking  9  (left)  and  12  (right)  points. 

Figure  23:  Nonlinear  Iterative  Solution-  Uncertainty  (reconstruction  variance  [m])  in  terrain  feature 
localization by tracking 3 (left) and 6 (right) points in images with LxL (L={1,3,4}) resolutions
(Altitude  2000);  See  text  for  details. 

Figure  23:  (continued)  -  Tracking  9  (left)  and  12  (right)  points. 


Figure  24:  Nonlinear  Iterative  Solution-  Uncertainty  (reconstruction  variance  [m])  in  terrain  feature 
localization  by  tracking  3  (left)  and  6  (right)  points  in  images  with  LxL  (L={1,3,4})  resolutions 
(Altitude  4000);  See  text  for  details. 

Figure  24:  (continued)  -  Tracking  9  (left)  and  12  (right)  points. 

Figure 25: Variances of UAV pose angles, computed from the rotation with respect to the ref.
coordinate system using the Rodrigues formula, for various altitudes, GPS measurement
uncertainties, and numbers of terrain feature points tracked in two views. See text for details.


Tables: 

Table 1: Variances of feature positions and variations with resolution and GPS uncertainty for 3
altitudes. Some discrepancies, where accuracy is slightly better for a lower GPS accuracy, are
due to the random selection of points in various simulations. See text for more details.

Table  2:  Pose  angles  estimation  variances  by  tracking  known  terrain  targets  at  various  altitudes. 
See  text  for  more  details. 


1  Summary 

1.1  Problem 

Development of automatic image analysis and interpretation systems has been among the
major activities supported by DoD (DARPA). Video surveillance in high-security areas,
potentially followed by automated face identification and recognition, abnormal event
detection, and mapping are only a few applications of interest that directly involve the
extraction of accurate information from images while covering as large a scene area as
possible. Image resolution is directly tied, first, to the ability to extract the sought-after
scene information; next, to the accuracy of this information; and finally, to the actions
that would be triggered based on the acquired knowledge. One scenario is to locate a
person/object in an image, to extract facial/structural features from one or more images
accurately enough to recognize the person/target, and to invoke proper action(s) depending
on the recognition (identification or classification) and (or) the confidence level of the
outcome. Such capabilities require motion video sensors that generate very high-resolution
images of the entire surrounding environment.

Increased resolution can be achieved by constructing a view, ideally a panorama, from
the images of a camera cluster, each camera covering a smaller section of the entire field
of view [9, 10, 11, 27, 28, 29, 32, 33, 38]. Here, the processing power of the computer is
exploited to carry out the necessary calculations for the alignment of the images. Existing
powerful PCs can achieve a seamless alignment of several standard CCD-resolution images
(roughly half a dozen to one dozen) in real time. Furthermore, the views can be combined
into super-resolution imagery [5], with a resolution much higher than that of each view,
though this requires more extensive computations. Such capabilities will improve with the
continuous growth in processing power. In addition, images from nearby positions (with
sufficiently large baselines) enable the application of photo-mosaicing technology to
construct rather large composite views [31].

A  conical  panorama  can  be  generated  from  vertical  scans  as  an  oblique  down-look 
camera  rotates  about  a  vertical  axis,  either  through  or  at  some  distance  from  the  projection 
center  [9].  A  slanted  down-look  configuration  is  particularly  suitable  for  flyover  imagery, 
while  the  adjustment  of  the  oblique  viewing  angle  permits  the  control  of  the  coverage 
area  and  overlap  in  various  cameras.  One  goal  of  this  Phase-1  SBIR  project  is  to  analyze 
the  multi-camera  realization  of  a  conical  panoramic  imaging  system,  targeted  for  airborne 
employment  at  a  range  of  altitudes.  Due  to  flexibility  in  the  choice  of  certain  parameters, 
the  system  can  be  designed  to  optimize  various  performance  criteria,  including  the  coverage 
and  resolution  based  on  the  overlapping  regions  of  neighboring  cameras,  for  a  suitable 
altitude  range.  Another  aspect  of  the  work  is  to  explore  the  role  of  image  resolution  in 
visual servo for UAV positioning over a terrain, say during a surveillance mission. The
objective is to establish charts of performance measures (precision in 3-D terrain target
localization and in UAV motion estimation and positioning) for various cases and
operational ranges.

To  meet  the  objectives,  a  number  of  cases  have  been  studied.  In  the  first  half  of  the 
project,  the  emphasis  was  placed  on  the  assessment  of  the  conical  panoramic  imaging 
system,  and  the  performance  in  terrain  reconstruction  and  motion  estimation  based  on 
the images from a camera cluster. In the second half, the study emphasized more
generally the role of image resolution on performance, independent of the configuration of
the  cameras  in  constructing  the  image;  a  single  high-resolution  camera  with  a  large  field 
of  view,  or  a  camera  cluster  each  with  a  lower  resolution,  smaller  field  of  view,  and  in 
arbitrary  configurations.  Furthermore,  attention  was  given  to  the  fact  that  the  terrain 
often  appears  relatively  flat  in  flyover  imagery,  particularly  at  high  altitudes.  As  a  result, 
there  is  generally  an  inherent  translation-rotation  visual  motion  ambiguity  that  results  in 
some  level  of  inaccuracy  in  estimating  the  motion  components,  namely  discerning  certain 
translation  and  rotation  components.  To  resolve  this  issue,  the  use  of  onboard  odometry 
information, GPS measurements, gyros, etc., can be instrumental. In particular, it is
important to know how the use of gyro and GPS measurements may improve visual servo.
We  have  explored  the  accuracy  in  determining  the  3-D  coordinates  of  terrain  features 
from  multiple  views  along  the  path,  utilizing  GPS  measurements  to  establish  the  UAV’s 
absolute  positions.  Consequently,  we  can  establish  the  precision  in  estimating  the  UAV 
pose  by  tracking/matching  the  2-D  projections  of  located  landmark  features  at  different 
UAV  positions.  While  gyros  can  provide  an  estimate  of  the  UAV  orientation,  the  drift 
error  can  become  significant  in  long-duration  surveillance  operations.  Thus,  integration  of 
visual  servo  with  gyro  can  potentially  provide  a  mechanism  to  estimate  the  drift,  as  well 
as  enhance  the  total  performance. 

1.2  Results 

In assessing the performance of the multi-camera panoramic imaging system, we have
utilized well-known solutions for 3-D scene reconstruction [17] and motion estimation [3,
20, 21], but have derived mathematical models to assess the accuracy of these solutions
in terms of various system design parameters. We have carried out computer simulations
to establish variations in system performance with these parameters, and provided charts
of performance versus system parameters. These are helpful in determining the pay-off in
the number and resolution of individual cameras.

The  application  of  three  different  methods  has  provided  the  assessment  of  accuracy  in 
the  estimation  of  3-D  motion  and  terrain  target  positions  from  multiple  views.  Numerous 
computer  simulations  with  these  techniques  have  been  carried  out  while  varying  image 
resolution, UAV altitude, GPS uncertainty, number of views, and image feature localization
uncertainty.  Charts  of  error  variance  have  been  generated  to  readily  analyze  achievable 
accuracy  under  various  cases,  and  to  establish  the  pay-off  in  increased  resolution  at  various 
altitudes. 

1.3  Conclusions 

For the conical imaging system, increased image resolution can enhance 1) 3-D localization
accuracy of the terrain targets; 2) estimation of the UAV's motion by utilizing the
correspondences of terrain features at nearby viewing positions; 3) computation of the UAV
position along its trajectory for registering images taken at multiple views to construct a
large composite image. Due to the small baseline (separation) of cameras within the imaging
system structure, depth perception from the visual cues in multiple cameras is limited to
very low flying altitudes, say at terrain distances of less than roughly 100 [m]. In this
range, depth perception can be substantially enhanced, to within 2-3 meters of uncertainty,
by the deployment of many (on the order of 1-2 dozen) very high-resolution (e.g., QSXGA
and QUXGA at 6-8 [MPix]) cameras. In contrast, a high level of X and Y accuracy, say a
localization uncertainty of roughly a decimeter, is feasible with half a dozen
average-resolution VGA cameras. Use of more high-resolution cameras can reduce the
uncertainty down to a couple of centimeters. A slanted viewing angle of the cameras can
provide not only larger coverage but, surprisingly, some improvement in localization
accuracy of terrain features and targets. Motion estimation accuracy is affected somewhat
adversely by the well-known inherent translation-rotation ambiguity of visual motion, and
by the uncertainty in the estimation of depth values from the camera cluster. Integration
with high-precision angle sensors, e.g., ring laser gyros (RLG), can substantially improve
the performance, to the point where the translation motion uncertainty is limited only by
the gyro inaccuracy.
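
To see why a small baseline confines depth perception to low altitudes, a first-order sketch for an idealized, rectified two-camera pair may help. This is a standard textbook approximation and not the error model derived in the appendices; the symbols $f$ (focal length in pixels), $b$ (baseline), $d$ (disparity in pixels), and $\sigma_d$ (disparity measurement noise) are introduced here for illustration only.

```latex
Z = \frac{f\,b}{d}
\quad\Longrightarrow\quad
\sigma_Z \;\approx\; \left|\frac{\partial Z}{\partial d}\right|\sigma_d
\;=\; \frac{Z^{2}}{f\,b}\,\sigma_d .
```

Under these assumptions the depth uncertainty grows with the square of the terrain distance $Z$ and shrinks in proportion to the baseline $b$ and to the focal length in pixels (that is, to image resolution), which is consistent with the trends described above.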

The altitude limitation and the somewhat inaccurate depth reconstruction are readily
overcome by the use of multiple images from relatively large baselines. Each image may be
generated with the multi-camera panoramic imaging system, or alternatively may be acquired
with a single high-resolution camera (the distinction is immaterial in assessing the
accuracy variation with resolution). We only need to assume the image resolution within
the desired field of view. Furthermore, use of GPS measurements lends several advantages.
First, GPS readings with baselines of 100 [m] or more between any two UAV views provide
a more accurate estimate of the UAV absolute position than visual motion techniques, and
furthermore are drift free. Second, incorporating this information enables us to establish
the UAV pose along its trajectory from visual cues. Finally, integration with gyro
measurements offers a mechanism to estimate and rectify gyro drift in long-duration
operations, and to compute pose more accurately by data fusion.
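
The report does not say which fusion rule is intended for combining gyro and vision estimates. As a minimal sketch of the idea, the code below uses inverse-variance weighting, one common way to fuse two independent, unbiased estimates of the same quantity. The function names, the small-angle treatment (no wrap-around handling), and the sample numbers are assumptions made for illustration.

```python
def fuse_inverse_variance(angle_a, var_a, angle_b, var_b):
    """Fuse two independent, unbiased estimates of the same angle.

    Each estimate is weighted by the inverse of its variance, so the more
    precise sensor dominates. Angles are assumed close enough that no
    wrap-around (e.g. 359 deg vs 1 deg) needs handling.
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused = (w_a * angle_a + w_b * angle_b) / (w_a + w_b)
    fused_var = 1.0 / (w_a + w_b)  # never larger than either input variance
    return fused, fused_var


def estimate_gyro_drift(gyro_angle, vision_angle):
    """Treat the drift-free vision estimate as reference for the gyro offset."""
    return gyro_angle - vision_angle


if __name__ == "__main__":
    # Hypothetical numbers: gyro pitch 10.0 deg (variance 0.04),
    # vision pitch 10.2 deg (variance 0.01).
    pitch, pitch_var = fuse_inverse_variance(10.0, 0.04, 10.2, 0.01)
    print(f"fused pitch {pitch:.3f} deg, variance {pitch_var:.4f}")
    print(f"apparent gyro drift {estimate_gyro_drift(10.0, 10.2):+.2f} deg")
```

The fused variance is always below the smaller of the two input variances, which is the sense in which data fusion can compute pose more accurately than either sensor alone.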

To provide some numbers, with multiple images 1) at 4[Kpix]x4[Kpix] resolution; 2)
extending over roughly 90 [deg] in field of view; 3) with a GPS error variance of
$\sigma_{GPS}$ = 1 [m] and an altitude measurement variance of 10$\sigma_{GPS}$ [m], we
achieve sub-meter terrain feature localization accuracy in each coordinate: an error
variance as low as 0.1-0.9 [m] by tracking a few dozen features over a couple (or more)
dozen views with a baseline of 1/5-1/10 of the altitude at 500-4000 [m]. We can also
determine the UAV orientation with error variances of less than 0.15-0.25 [deg] in pitch
and roll and 0.01-0.05 [deg] in heading at 500-4000 [m] altitudes, by tracking two dozen
features with a GPS error variance of $\sigma_{GPS}$ = 1 [m] and a UAV altitude
uncertainty of 15$\sigma_{GPS}$ [m].
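
For a sense of scale, the ground footprint of one pixel can be estimated from the altitude, field of view, and pixel count quoted above. The sketch below assumes a nadir-looking camera over flat terrain and reads 4[Kpix] as 4000 pixels across; both are simplifying assumptions, and the function names are hypothetical.

```python
import math


def ground_footprint_m(altitude_m, fov_deg):
    """Width of the ground strip seen by a nadir-looking camera over flat terrain."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def ground_sample_distance_m(altitude_m, fov_deg, pixels_across):
    """Approximate ground distance covered by one pixel (footprint / pixel count)."""
    return ground_footprint_m(altitude_m, fov_deg) / pixels_across


if __name__ == "__main__":
    # Altitudes and field of view from the text; 4000 pixels is an assumed
    # reading of "4 [Kpix]".
    for altitude in (500, 2000, 4000):
        gsd = ground_sample_distance_m(altitude, 90.0, 4000)
        print(f"A = {altitude} m: about {gsd:.2f} m per pixel")
```

This single-pixel footprint is only a rough yardstick for image measurement noise. The localization accuracies reported in the text come from combining many views and many features, so they are not bounded by it.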

GPS  and  altitude  uncertainties  both  play  a  critical  role  in  the  accuracy  of  the  UAV 
pitch  and  roll  motions,  but  have  much  less  impact  on  determining  the  heading.  Both 
measurement  errors  are  less  significant  at  higher  altitudes  (as  a  percentage),  thus  leading 
to better localization and pose estimation accuracies. The performance can be improved
by increasing the field of view and reducing the UAV altitude measurement error, in
addition to algorithmic adjustments and optimizations that have not been considered in
this study.

Before  moving  forward,  we  make  a  terminology  clarification.  When  talking  about 
camera  or  robot  “pose”  in  the  vision  literature,  one  often  refers  to  both  the  position  and 
orientation.  The  work  in  the  second  half  of  this  study  primarily  deals  with  the  estimation 
of  the  UAV  orientation  from  visual  cues,  when  GPS  estimates  are  used  to  establish  the 
UAV absolute position. Thus, “pose estimation,” “estimated pose,” “variance of pose
parameters,” etc., primarily refer to the UAV orientation. Unless the context clearly
indicates otherwise, UAV pose means the orientation at some position along its path.

2  Introduction


An  image  is  a  two-dimensional  (2-D)  view  of  a  three-dimensional  (3-D)  world  from  a 
specific viewpoint. An image acquired by a traditional camera has limited resolution and
field  of  view.  An  evolving  imaging  technology  is  based  on  the  emergence  of  a  new  paradigm, 
the  so-called  “computational  camera”.  This  refers  to  a  digital  device  that  embodies  the 
unification  of  imaging  sensors  and  the  computer.  Based  on  projection  geometry  and  image 
remapping,  various  capabilities  become  feasible  that  are  either  impossible,  too  complex, 
and  (or)  very  costly  to  achieve  with  traditional  cameras  and  lenses.  Examples  include 
the  computer  processing  of  typical  images  (projections  onto  planes)  to  construct  views 
that  correspond  to  1)  any  arbitrary  camera  view,  or  2)  projections  onto  arbitrary  image 
surfaces, say spheres, cylinders, and cones. Significant technological advances are being
made possible by exploiting computing power that is growing at a tremendous rate and at
diminishing cost. This enables carrying out more and more complex operations that result
in higher and higher image quality. However, one still needs devices that can provide views
of the entire scene, or of some relevant large region of it. As depicted in fig. 1, a
cluster of N cameras properly positioned relative to each other, each covering some region
of the scene, can provide panoramic coverage. The alignment of the images to generate the
panorama
is  performed  based  on  well-known  theories  in  camera  calibration  and  image  registration 
[18].  The  computational  speed  of  current  powerful  PCs  can  enable  the  registration  of 
several  (order  of  half  to  one  dozen)  images  at  video  rate,  enabling  the  construction  of 
panoramic  imagery  as  quickly  as  the  camera  images  are  recorded. 

The  number  and  resolution  of  each  individual  camera  control  the  resolution  of  the 
constructed  panorama.  Clearly,  the  highest  quality  image  is  expected  when  utilizing  a 
larger  and  larger  number  of  cameras  with  the  highest  possible  resolution.  However,  this 
has  to  be  traded  off  against  the  requirements  for  data  storage,  processing,  transmission, 
etc.  The  goal  of  this  study  is  to  assess  the  achievable  accuracy  and  resolution  in  terms  of 
the  total  number,  resolution,  and  configuration  of  individual  cameras.  The  accuracy  and 
resolution  are  tightly  related  and  their  relevance  is  tied  to  the  application.  In  the  first  half 
of  our  work,  we  explored  the  impact  of  a  number  of  design  parameters  of  the  panoramic 
imaging  system  on  terrain  feature  localization. 

In  this  second  half  of  our  work,  we  have  been  primarily  concerned  with  the  accuracy  in 
locating  certain  ground  landmark  targets,  and  consequently  tracking  them  to  determine  the 
aircraft’s  pose  from  recorded  images  and  GPS  readings.  Unlike  the  first  half  of  the  project, 
where  the  investigation  also  addressed  the  impact  of  camera  cluster  configuration  within 
the  panoramic  system,  the  emphasis  here  is  on  the  image  resolution,  UAV  altitude  and 
GPS  measurement  uncertainty.  In  particular,  we  have  explored  two  operational  scenarios. 

1.  The UAV is circling over some terrain, say a metropolitan area, in a long surveillance
operation. It has no prior knowledge of the terrain. By locating and tracking certain
fixed targets over several images and utilizing GPS readings, it computes its position
and pose, and more importantly the 3-D positions of the tracked targets.

2.  Based on knowledge of some fixed ground landmarks (e.g., determined by the method
in the first scenario), the aircraft establishes its position and (or) trajectory. While
GPS  gives  the  position  and  gyros  provide  pose  information,  gyro  measurements  do 
drift  over  an  extended  time.  Thus  we  seek  the  utilization  of  a  drift-insensitive  vision- 
based  approach,  which  may  simultaneously  allow  the  estimation  and  (or)  correction 
of  gyro  drift,  as  well  as  an  effective  estimation  method  potentially  based  on  the 
integration  of  gyro  and  visual  servo  solutions. 

The information determined above can feed into the system for building a super-resolution
map/mosaic of the terrain, and/or for tracking certain moving targets of interest.

3  Methods  and  Assumptions 

3.1  Panoramic  Views 

Reconstruction  of  a  terrain  map  from  multiple  images  acquired  on  an  airborne  system  has 
been  known  to  photogrammetry  engineers  since  the  early  1900’s  [4,  8,  30].  Over  the  last 
two  to  three  decades,  the  same  mathematical  models  have  been  deployed  extensively  in  the 
computer  vision  community  for  numerous  applications,  including  the  automated  operation 
of  robotics  systems,  mobile  platforms  and  mapping  systems.  In  analyzing  the  panoramic 
imaging system, we have utilized well-known solutions for 3-D scene reconstruction [17]
and motion estimation [3, 20, 21]; however, we have incorporated new mathematical models
that we have derived to assess the sensitivity to measurement noise for various system
parameters. These models enable us to assess the accuracy in localizing a terrain feature
based on multiple observations in the camera cluster as a function of various system
parameters: error in determining the image position of a terrain feature, number and
resolution of the cameras, viewing angle, altitude, and area coverage. (Appendix 1 gives
the detailed theoretical foundation of the results given here.) In particular, determination
of feature positions as image measurements is tied to the quantization level in a
digitized image; the higher the resolution, the more accurate the feature position. Charts
of accuracy versus imaging system parameters have been generated to readily analyze and
assess the performance of a multi-camera panoramic imaging system, and to establish the
pay-off in the number and resolution of individual cameras, their arrangement and viewing
angles, etc.

In our simulations, the primary assumption is that each camera satisfies the pinhole
camera model [17]. In practice, this requires the calibration of each camera 1) to determine
its internal parameters, including the focal length, aspect ratio (horizontal to vertical
scaling ratio) and the actual location of the image center (where the optical axis pierces
the image plane); 2) to rectify each image to remove lens distortions [18]. Furthermore, we
assume perfect knowledge of the relative positions and orientations of the cameras in the
coordinate frame of the panoramic imaging system. In practice, this is determined by a
variety of external calibration methods [35, 37]. Typical achievable accuracy, measured
in terms of the average misalignment error of features used in calibration, is some small
fraction of a pixel size, which is sufficient for most applications.
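
As a minimal sketch of the pinhole model with the internal parameters listed above, the function below projects a 3-D point given in the camera frame to pixel coordinates. It assumes a distortion-free (already rectified) image and interprets the aspect ratio as horizontal focal scale divided by vertical focal scale; the function name, argument layout, and sample values are hypothetical.

```python
def project_pinhole(point_cam, focal_px, aspect_ratio, center_px):
    """Project a 3-D point (camera frame, z along the optical axis) to pixels.

    focal_px     : horizontal focal length in pixels
    aspect_ratio : horizontal-to-vertical scaling ratio (assumed fx / fy)
    center_px    : (cx, cy), where the optical axis pierces the image plane
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = center_px
    fx = focal_px
    fy = focal_px / aspect_ratio
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v


if __name__ == "__main__":
    # Hypothetical camera: 4000 x 4000 pixels with a 90-degree field of view,
    # hence a focal length of 2000 pixels and an image center at (2000, 2000).
    u, v = project_pinhole((10.0, -5.0, 500.0), 2000.0, 1.0, (2000.0, 2000.0))
    print(u, v)
```

In a full system the point would first be transformed from the panoramic-system frame into each camera's frame using the external calibration mentioned above; that step is omitted here.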


3.2  3-D  Reconstruction  from  Multiple  UAV  Positions 

Application of three different methods, detailed in Appendix 2, has provided the
assessment of accuracy in the estimation of 3-D motion and terrain target positions from
multiple views. The underlying constraint is the rigid body motion constraint

$$P_j = R_{ij} P_i + t_{ij}$$


where $P_i$ denotes the coordinates of a terrain point in the coordinate system of the
camera at position i, and $\{t_{ij}, R_{ij}\}$ are the 3-D vector and 3x3 rotation matrix
that describe the displacement and angular motion of the aircraft from position i to j. It
is well known that estimation of both translational and rotational motions from image
projections of features on a relatively flat scene^1 can be an ill-conditioned problem [1].
The use of GPS provides several advantages. First, GPS and accurate altitude measurements
with baselines of 100 [m] or more between two UAV positions generally provide better
estimates of the UAV absolute position than visual motion techniques, while being drift
free. This simplifies the problem to estimating the UAV pose from visual cues. Furthermore,
integration of this information with gyro measurements offers a mechanism to estimate and
rectify gyro drift in long-duration operations, and to compute pose more accurately through
data fusion. Therefore, we determine the displacement vector $t_{ij}$ between two UAV
positions $T_i$ and $T_j$ from the combination of GPS and altitude measurements, up to
some varying levels of accuracy:

tij  =  Tj  —  Ti- 

We  will  discuss  the  impact  of  GPS  and  altitude  measurement  errors  as  factors  influencing 
the  system  performance. 

^1 In our case, the terrain appears flat when viewed from the aircraft at high altitude, since height
variations due to buildings and other structures are typically negligible relative to the aircraft altitude.

A large number of simulations have been carried out for each of the two scenarios
described in the introduction section, while varying a number of parameters:

•  Image  resolutions  of  L  [Kpix]  x  L  [Kpix]  (L  =  {1,3,4}); 

• Flight altitude A [m] (A = {500, 2000, 4000});

• Number of views M (M = {2 : 21});

•  Number  of  terrain  features  N  (typically  N  =  {15,30,45,60},  though  other  cases 
have  also  been  tested  for  certain  algorithms); 

• GPS X and Y position variances of $\sigma_{GPS}$ = {0, 1, 2} [m];

• Altitude measurement error with variances of $1.0\sigma_{GPS}$ (in terrain feature localization)
and $1.5\sigma_{GPS}$ (in pose estimation);

• Pixel localization error with variance $\sigma_f$ ($\sigma_f$ = {1/3, 1} [pix]).

We have assumed a lower altitude measurement error variance of $1.0\sigma_{GPS}$ [m] in feature
localization experiments than the $1.5\sigma_{GPS}$ [m] used for pose estimation. The motivation here is that,
generally, we want to establish the locations of terrain features with high accuracy, as they
are used in subsequent operations for UAV positioning. Therefore, we may use sensors
with higher precision in the first scenario, and a more typical sensor in the latter operation.
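
The parameter sweep described above can be sketched as a simple grid. This is only an illustration: the full Cartesian product is an assumption (the report does not state that every combination was run, and $\sigma_f$ = 1 [pix] is used in one case only), and the names `parameter_grid` and the dictionary keys are hypothetical.

```python
from itertools import product

# Parameter values listed in the text. Combining them as a full Cartesian
# product is an assumption made for illustration only.
RESOLUTIONS_KPIX = [1, 3, 4]        # L: image is L x L [Kpix]
ALTITUDES_M = [500, 2000, 4000]     # A [m]
NUM_VIEWS = list(range(2, 22))      # M = 2 : 21
NUM_FEATURES = [15, 30, 45, 60]     # N
GPS_VARIANCES_M = [0, 1, 2]         # sigma_GPS [m]
PIXEL_VARIANCES_PIX = [1 / 3, 1]    # sigma_f [pix]


def parameter_grid():
    """Return one dict per simulated configuration (hypothetical layout)."""
    keys = ("L", "A", "M", "N", "sigma_gps", "sigma_f")
    values = (RESOLUTIONS_KPIX, ALTITUDES_M, NUM_VIEWS,
              NUM_FEATURES, GPS_VARIANCES_M, PIXEL_VARIANCES_PIX)
    return [dict(zip(keys, combo)) for combo in product(*values)]


grid = parameter_grid()
print(len(grid), grid[0])
```

Each configuration would then be repeated over many noise samples to estimate the variance of the result.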

We  have  assumed  a  visual  field  of  view  (FOV)  of  roughly  90  [deg]  in  both  horizontal 
and  vertical  directions,  in  all  experiments.  This  gives  us  a  relatively  large  FOV  which 
is  desired  for  robust  motion  estimation.  The  various  resolutions  within  this  fixed  FOV 
may be achieved with one or more cameras^2. As one scenario, the three resolutions we
will  compare  can  be  realized  with  a  single  one-Mpix  camera,  and  3x3  and  4x4  arrays  of 
cameras,  each  with  one  Mpix  resolution  but  a  smaller  FOV.  The  FOV  was  purposely  fixed 
in  all  the  cases,  so  we  can  establish  merely  the  impact  of  image  resolution.  In  particular, 
since  many  studies  point  to  improved  motion  estimation  accuracy  with  larger  FOVs,  we 
did  not  want  this  fact  to  corrupt  our  conclusions. 

The  three  assumed  altitudes  are  treated  as  low  (500  [m])  and  intermediate  (2000- 
4000  [m])  ranges.  As  we  will  note  from  the  results,  error  variances  for  higher  image 
resolutions  converge  at  intermediate  altitudes,  enabling  us  to  infer  the  behavior  at  higher 
altitudes.  For  GPS  readings,  the  error-free  case  study  is  useful  to  establish  the  estimation 

^2 One consideration may be the trade-off in the cost of a high-quality lens with small distortion for
a  wide-angle  FOV  (as  used  in  photogrammetry  applications)  in  comparison  to  lower  quality  inexpensive 
lenses  and  complexity  of  1)  lens  distortion  correction  with  a  wide-angle  FOV,  versus  2)  intrinsic/external 
calibration  of  2-3  cameras  with  smaller  FOVs  and  lower  distortions. 

error due solely to image feature position inaccuracies^3. The range of M = {2 : 21}
views provides a fairly conclusive assessment of the impact on 3-D target reconstruction
accuracy  in  the  first  scenario,  which  would  subsequently  be  utilized  for  UAV  localization 
in  the  second  scenario. 

The feature position uncertainty of $\sigma_f$ = 1/3 [pix] is justified by the fact that many
feature detection and localization methods perform at sub-pixel accuracy [14, 7, 5]. Specifically,
our assumption means that the N selected features have been localized at no worse
than $3\sigma_f$ = 1 [pix] uncertainty. One can conservatively identify, match and track two orders
of magnitude or more features than the few dozen (15-60) used in our study at the
assumed accuracy level, given the abundance of features in 1-16 Mpix images of textured
ground scenes. The point selection can be readily accomplished by the application of
robust statistics, including RANSAC-based algorithms [13]. Therefore, one interpretation
of our assumption is that the matched features have previously passed a threshold test
of no worse than one-pixel localization error by a RANSAC-based method^4. In one case
though, for the lower altitude of 500 [m], the results have been derived for $\sigma_f$ = 1 [pix], as
a more severe test of performance.

The coordinate frame of the imaging system at one view has been chosen as the reference
system. In the simulations, the UAV positions were chosen over the terrain area that
is seen by the cameras at the reference view. This corresponds to an area of roughly 2A x 2A
in the XY plane (A is the altitude). The baseline is of the order of (1/5 - 1/10) * A in
simulations involving a smaller number of views. Some baselines between pairs of UAV
positions become smaller as the number of views increases, say to a dozen or more. Fig. 4
shows  selected  sample  examples  from  our  simulations,  where  the  (red)  circles  show  the 
terrain  feature  points,  and  the  (blue)  crosses  are  the  UAV  positions.  We  have  assumed 
a  frame-to-frame  rotational  motion  in  the  order  of  1-2  [deg]  when  testing  the  small-angle 
approximation  method,  and  a  few  degrees  for  all  other  cases. 
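
As a rough check of the 2A x 2A coverage quoted above, the footprint of a downward-looking camera with a 90 [deg] FOV over flat terrain can be computed directly. The sketch below is only illustrative; the function names are hypothetical, and treating 1 [Kpix] as 1000 pixels is an assumption (the report may intend 1024).

```python
import math


def ground_footprint_m(altitude_m, fov_deg=90.0):
    """Side length of the terrain square seen by a nadir-looking camera."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def ground_sample_distance_m(altitude_m, resolution_kpix, fov_deg=90.0,
                             pixels_per_kpix=1000):
    """Terrain size of one pixel; pixels_per_kpix = 1000 is an assumption."""
    width_px = resolution_kpix * pixels_per_kpix
    return ground_footprint_m(altitude_m, fov_deg) / width_px


for altitude in (500, 2000, 4000):
    for L in (1, 3, 4):
        print(altitude, L, round(ground_sample_distance_m(altitude, L), 3))
```

Under these assumptions, a pixel at A = 500 [m] covers about 1 [m] of terrain at L = 1 and about 0.25 [m] at L = 4, which gives a sense of scale for the pixel localization errors used in the simulations.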

In the first scenario described in the introduction section, we have tested three algorithms
described in Appendix 2: the classical eight-point algorithm [24], a nonlinear optimization
method, and a closed-form solution for small rotation angles. In the second scenario,
determining UAV pose by tracking terrain targets, the nonlinear method of section 2.5 in
Appendix 2 has been applied. Given the very large number of simulations carried out,
it  is  difficult  to  establish  the  best  way  to  arrange  the  graphs  involving  the  variations  in 
numerous  parameters.  As  an  example,  the  results  from  the  eight-point  algorithm,  for  the 

^3 In the remainder, unless implied or stated otherwise, GPS readings refer to measurements of all three
components of UAV position, thus including the altitude measurements. However, the reader should note that
we have assumed a different level of uncertainty for the two measurements; see above.

^4 This is a reasonable assumption, given the size of our images. Also, we have drawn the feature
noise randomly from a Gaussian distribution, and thus some pixels may have a noise higher than $3\sigma_f$ =
1 [pix].

altitude of 500 [m] while varying other parameters, have been tabulated in 6 pages each
with  120  plots.  We  have  presented  the  figures  for  various  simulations  in  Appendices  3(a-f), 
and  have  given  a  summary  and  general  conclusions  in  the  main  body  of  the  report. 

As  indicated  above,  the  terrain  points  have  been  selected  randomly.  Each  simulation 
was  repeated  300  times  with  different  noise  samples-  for  errors  in  image  positions,  GPS 
readings,  and  imprecise  knowledge  of  terrain  feature  positions  in  the  second  scenario- 
in  order  to  determine  the  estimation  statistics,  namely  the  variance  as  the  uncertainty 
measure.  In  some  cases,  the  camera  positions  and  viewing  angles  are  not  ideal  for  certain 
points,  resulting  in  poor  estimates  of  their  positions.  In  practice,  such  situations  can  be 
readily  identified  by  examining  the  reprojection  errors.  However,  the  scope  of  this  work 
has  not  been  to  develop  a  suitable  implementation  of  some  3-D  reconstruction  algorithm, 
but rather the investigation of performance. As a result, we have chosen to eliminate
as outliers 25% (out of N = {30, 45, 60}) of the points with the highest reconstruction
variances. (This means that the variance plots show the performance for N' = 3N/4
points.)
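
The outlier rule described above can be sketched as follows. This is a minimal illustration, not the report's implementation: the array layout (trials x points x 3), the use of the summed X, Y, Z variance as the ranking criterion, and the function name are all assumptions.

```python
import numpy as np


def trim_high_variance_points(estimates, discard_fraction=0.25):
    """Keep the points with the lowest reconstruction variance.

    estimates: array of shape (trials, N, 3) holding the reconstructed
    position of each of N points in every Monte Carlo trial (assumed layout).
    Returns the indices of the kept points and their variances.
    """
    n_points = estimates.shape[1]
    # Variance over trials, summed over X, Y, Z (assumed ranking criterion).
    per_point_var = estimates.var(axis=0).sum(axis=1)
    n_keep = int((1.0 - discard_fraction) * n_points)   # N' = 3N/4
    keep = np.sort(np.argsort(per_point_var)[:n_keep])
    return keep, per_point_var[keep]


# Synthetic example: 300 trials, N = 60 points with differing noise levels.
rng = np.random.default_rng(0)
noise_scale = rng.uniform(0.1, 5.0, size=60)
samples = rng.normal(size=(300, 60, 3)) * noise_scale[None, :, None]
kept, kept_var = trim_high_variance_points(samples)
print(len(kept))
```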

For a sample experiment, fig. 4 depicts the projection onto the XY plane of the randomly
selected terrain features (red circle) and the camera positions (black x). We have
not shown the Z coordinates of these features (heights above the ground reference plane),
which have been generated with a mean and variance of 8 [m]. The UAV altitudes have
been varied, from nominal {500, 2000, 4000} [m] values, by about 10-20 [m] in experiments
with a smaller number of views, and by as much as 50-75 [m] when approaching the maximum
number of views (say, 15-21 views). In these plots, certain terrain features are marked with
red dots, and others with a blue rectangle. These, respectively, correspond to 1) points
with reconstruction variance above 3 times the median (3*med), and 2) the N - N' points that
have been discarded from our results. This example shows that with a smaller number of
views, the discarded points have variances larger than 3*med, while some discarded points
have a variance less than 3*med as the number of views increases. Also, these points are
typically clustered near the region boundaries, where there is little variation in viewing
direction from several UAV positions.

4  Results 

4.1  Panoramic  Imaging  System 

The images in fig. 1 depict the terrain size covered by various numbers of cameras, from a total
of N, in the panoramic imaging system, at altitudes of A = 100 [m] (top-right),
A = 500 [m] (bottom-left) and A = 2000 [m] (bottom-right). In these images, N = 12
cameras have been assumed, each with a resolution of 1024x768, roughly 70 [deg] in FOV,
and configured at a viewing angle of $\theta$ = 75 [deg] (measured from down-look position).
The significance of these particular plots is that, by determining areas of the terrain that
are imaged by a relatively large number of cameras, we can establish the terrain region
size that may be imaged at the highest resolution in a super-resolution panorama (to be
performed in the second half of the project).

In  most  applications,  we  are  primarily  concerned  with  the  central  region  of  the  image 
which  maps  the  area  directly  below  the  UAV.  We  want  to  view  a  larger  portion  of  this 
region  with  as  many  of  the  N  cameras  as  possible.  In  these  plots,  we  see  that  when  the 
UAV flies at 100 [m], a central area of 30 [m] x 30 [m] is imaged by all 12 cameras. This area
increases  to  roughly  150x150  [sqm]  for  an  altitude  of  500  [m],  and  over  500x500  [sqm] 
for  an  altitude  of  2000  [m].  Unfortunately,  with  increased  elevation,  each  image  pixel 
corresponds  to  a  larger  local  region  of  the  terrain,  leading  to  lower  resolution.  Depicted 
in  fig.  3  are  sample  1024x768  images  from  altitudes  100  [m],  200  [m]  and  500  [m].  A 
measure  of  relative  resolution  is  the  image  size  of  a  particular  feature  in  these  images. 
For  any  UAV  altitude  A,  N  such  images  from  the  cameras  that  view  different  regions  of 
the  terrain  can  be  fused  to  generate  a  super-resolution  panorama.  Roughly  speaking,  4 
cameras  at  altitude  A  can  readily  generate  a  super-resolution  image  equivalent  to  a  single 
image  at  half  the  altitude.  With  respect  to  our  example,  a  super  resolution  comparable  to 
the image at 100 [m] can be generated from 4 images at 200 [m], or from a 5x5 camera
array  at  500  [m].  Clearly,  performance  can  be  improved  substantially  if  we  start  with  very 
high-resolution  cameras,  say  each  at  6-8  mega  pixels. 
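
The rule of thumb above (4 cameras at altitude A matching one image at A/2, and a 5x5 array at 500 [m] matching one image at 100 [m]) follows from ground resolution scaling linearly with altitude. A minimal sketch, assuming an idealized tiling in which each camera covers its own share of the scene with no overlap (the function name is hypothetical):

```python
import math


def cameras_for_equivalent_resolution(altitude_m, reference_altitude_m):
    """Array size needed at altitude_m to match one image at reference_altitude_m.

    Assumes an idealized square tiling with no overlap between cameras.
    Returns (cameras per side, total cameras).
    """
    ratio = altitude_m / reference_altitude_m
    per_side = max(1, math.ceil(ratio - 1e-9))
    return per_side, per_side * per_side


print(cameras_for_equivalent_resolution(200, 100))
print(cameras_for_equivalent_resolution(500, 100))
```

In practice some overlap between neighboring views is needed for registration, so the counts from this sketch should be read as lower bounds.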

4.1.1  3-D  Localization  Accuracy: 

A number of sample experimental results, given in Appendix 3a, demonstrate the application
of our analytical results in assessing the performance of a multi-camera imaging
system for target localization and mapping. Referring to fig. 4 in Appendix 3a, the top
row shows the variance (uncertainty measure) in determining the X position of a terrain
feature, detected in various views of the panoramic imaging system. The assumed uncertainty
arises from the errors in the image positions of a feature due to image quantization.
The X and Y position uncertainties are the same, and thus only X direction results are
given. In these plots, we are comparing the accuracy of a system with N = 6 : 16 cameras,
when flying at three different altitudes of 100 [m], 500 [m] and 2000 [m]. Each curve
corresponds  to  a  certain  number  of  cameras.  The  cameras  are  arranged  symmetrically 
on  a  circular  ring  of  radius  d  =  0.25  [m],  at  a  slant  angle  of  75  [deg];  refer  to  fig.  1  for 
system  structure.  We  have  shown  in  the  left  column  the  uncertainty  measures  when  the 
image  resolution  varies  from  640  x  480  (for  a  typical  VGA  camera)  to  3200  x  2400  (for  a 
very  high-resolution  QUXGA  camera).  As  an  example,  consider  the  top-left  figure  for  the 
100  [m]  altitude.  We  can  determine  the  X  coordinate  of  a  terrain  feature  with  a  variance  of 
less than 10 [cm] when viewed by six VGA cameras. The uncertainty drops to about 6 [cm]
when  16  cameras  are  used,  and  to  about  2  [cm]  with  6  high-resolution  QUXGA  cameras. 
The  corresponding  results  for  a  higher  500  [m]  altitude  show  larger  uncertainties,  roughly 
0.47  [m],  0.3  [m]  and  0.1  [m]  respectively.  For  an  altitude  of  2000  [m],  we  can  achieve  an 
accuracy  of  roughly  1.2  [m]  with  16  typical  VGA  cameras,  while  6  high-resolution  QUXGA 
cameras  would  give  a  one-sigma  position  uncertainty  of  0.4  [m]. 

The  plots  on  the  right  column  show  little/no  advantage  in  increasing  the  camera  sep¬ 
arations  within  a  larger  imaging  system  structure.  From  these  plots,  we  conclude:  1)  We 
can  generally  establish  the  X  and  Y  positions  of  the  terrain  feature  with  relatively  high 
accuracy;  2)  We  can  also  determine  how  the  precision  varies  with  various  parameters. 

As depicted in fig. 5, the story is quite different for position along the Z direction, which
is tied directly to the ability to determine the vertical size (or height above ground) of
terrain features. At a flying altitude of 100 [m], for example, the estimation uncertainty
is as large as 38 [m] when utilizing 6 VGA cameras. This drops to about 27 [m] and
23 [m] with 12 and 16 VGA cameras, respectively. Taking advantage of high-resolution
digital cameras (e.g., 4-8 [Mpix]), the performance increases significantly, approaching an
uncertainty of 10 [m] or better. For 8 [Mpix] QUXGA cameras, the uncertainty is within
6  [m].  The  variances  are  meaningless  for  the  altitudes  of  500  [m]  and  2000  [m]  (most  values 
are  larger  than  the  flight  altitude).  This  indicates  no  ability  for  3-D  depth  estimation  with 
the  multiple  images  of  the  camera  cluster.  This  very  inferior  performance  (compared  to  X 
and  Y  uncertainty)  is  not  surprising,  and  is  directly  tied  to  the  small/negligible  baseline 
of  the  imaging  system  (distance  between  nearby  cameras).  The  performance  plots  in 
the  right  column,  with  12  cameras,  show  lower  uncertainties  by  increasing  the  imaging 
system  structure  radius  d  to  provide  a  larger  separation  between  the  cameras,  refer  to 
fig.  1.  For  example,  the  uncertainty  drops  from  27  [m]  to  roughly  15  [m]  or  less,  if  the 
radius  d  of  the  12-camera  imaging  system  is  increased  from  0.25  [m]  to  0.5  [m]  or  larger. 
Another  important  conclusion  of  these  results  is  that  the  percentage  improvement  is  less 
when  high-resolution  cameras  are  used  (e.g.,  roughly  from  6  [m]  to  4  [m]  with  12  QUXGA 
cameras). 
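
The strong dependence of depth uncertainty on baseline and altitude can be illustrated with the standard first-order error model for a two-camera stereo pair, $\sigma_Z \approx Z^2 \sigma_d / (f b)$. This is not the report's analytical model: it assumes two parallel, fronto-parallel cameras rather than the slanted ring, so the numbers are not expected to match the plots; only the trends (quadratic growth with range, inverse dependence on baseline and focal length in pixels) carry over. All function names are hypothetical.

```python
import math


def focal_length_px(image_width_px, fov_deg):
    """Pinhole focal length in pixels for a given horizontal FOV."""
    return (image_width_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def stereo_depth_sigma_m(depth_m, baseline_m, focal_px, disparity_sigma_px):
    """First-order depth uncertainty of an idealized parallel stereo pair."""
    return depth_m ** 2 * disparity_sigma_px / (focal_px * baseline_m)


f_vga = focal_length_px(640, 70.0)
quantization_sigma = 1.0 / math.sqrt(12.0)   # uniform one-pixel quantization
for altitude in (100.0, 500.0, 2000.0):
    for baseline in (0.5, 1.0):
        sigma = stereo_depth_sigma_m(altitude, baseline, f_vga, quantization_sigma)
        print(altitude, baseline, round(sigma, 1))
```

With sub-meter baselines the predicted depth error exceeds the range itself well before 2000 [m], which is consistent with the qualitative conclusion that the camera cluster alone cannot recover depth at the higher altitudes.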

How  large  the  structure’s  radius  can  be  made  is  tied  to  the  UAV  size  and  payload 
capabilities. For example, smaller UAVs would require a more compact low-weight camera
cluster  assembly,  say  one  with  a  radius  in  the  order  of  0.2  [m]  or  less.  Mid-size  UAVs  can 
carry  larger  units,  say  0.2  [m]  to  0.35  [m]  radii,  while  the  size  and  weight  constraints  are 
less  limiting  on  a  larger  UAV.  Therefore,  one  direct  application  of  this  graph  is  that  for  a 
given  UAV  platform,  we  can  readily  determine  how  many  cameras  and  what  resolutions  are 
necessary to achieve a desired level of accuracy in 3-D depth perception. These results suggest
that acceptable accuracy in depth perception is generally restricted to low altitudes.
For 3-D estimation and reconstruction from high altitudes, we need to utilize methods
based on multiple view geometry, well known in the photogrammetry and structure from
motion/stereo literature [4, 8, 17, 30].

An interesting result is the slight variation in accuracy by changing the camera viewing
angle $\theta$; refer to fig. 1. Shown in fig. 6 (top) is the localization uncertainty, while varying
the number of cameras at 1024x768 resolution. On the bottom, we have fixed the number
of cameras to 12, but have changed the resolution. For example, the top plot shows that
the 27 [m] uncertainty with 6 cameras at a slant angle of 75 [deg] can be lowered to about
19 [m], if the viewing angle is adjusted to 60 [deg]. We also conclude from these plots
that  the  choice  of  the  viewing  angle  is  more  critical  when  using  a  smaller  number  and  (or) 
lower  resolution  cameras. 

4.1.2  3-D  Motion  Estimation  from  3-D  Measurements: 

Given  the  estimated  3-D  positions  of  terrain  targets  from  two  view  points,  we  can  deploy 
the  solution  to  the  absolute  orientation  problem  from  photogrammetry  to  determine  the 
motion of an imaging system from one viewpoint to the next. The problem has a closed-form
solution, as given in section 1.2 of Appendix 1 [3, 20, 21]. A number of problems can
be addressed based on this solution, including the construction of 2-D photo-mosaics and 3-D
terrain reconstruction from panoramas at multiple views. We are interested in establishing
how accurately the 3-D information can be determined from the images of the multi-camera
panoramic system, and how the precision compares with alternative methods. Based on
the earlier results, particularly the high inaccuracy in determining the Z component of a
terrain feature position from higher altitudes, the following results are determined for only
the low altitude of 100 [m]. We have utilized a 10% density of image points (i.e., one out
of every 10 pixels) for 3-D motion computations in all of the results given in Appendix 3b,
except for one case where we analyze the impact of the number of terrain features. Furthermore,
these  uncertainties  result  from  the  errors  in  image  feature  positions,  and  thus  hold  for 
any  motion  size.  Therefore,  the  larger  the  motion  size,  the  larger  the  SNR  of  the  motion 
estimation  process. 
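
For readers who want a concrete picture of a closed-form absolute orientation solution, the sketch below uses the widely known SVD-based least-squares method. It is one standard formulation and is not necessarily the one given in Appendix 1; the function name and the row-wise point layout are assumptions.

```python
import numpy as np


def absolute_orientation(points_a, points_b):
    """Least-squares R, t such that points_b ~= R @ points_a + t.

    points_a, points_b: (N, 3) arrays of corresponding 3-D points measured
    in the two coordinate systems (assumed layout, N >= 3, non-collinear).
    """
    centroid_a = points_a.mean(axis=0)
    centroid_b = points_b.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (points_a - centroid_a).T @ (points_b - centroid_b)
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1) rather than a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = centroid_b - R @ centroid_a
    return R, t
```

With noisy inputs, the recovered rotation and translation inherit the uncertainty of the 3-D points, which is what the variance plots in this section quantify.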

Fig. 7 depicts the uncertainty in the six translational and rotational components of
the  3-D  motion  as  the  image  resolution  of  each  camera  is  varied.  Each  plot  gives  the 
variation  of  these  results  with  the  number  of  cameras.  One  immediate  conclusion  is  that 
camera  resolution,  more  than  the  number  of  cameras,  is  a  critical  factor.  At  highest  image 
resolution,  we  roughly  have  1  [deg]  and  2  [m]  uncertainties  in  the  X  and  Y  components  of 
rotation  and  translation,  respectively.  In  contrast,  we  have  nearly  0.25  [deg]  and  0.6  [m] 
for  the  Z  components,  respectively.  At  first,  this  appears  surprising  as  our  earlier  results 
showed  the  X  and  Y  components  of  terrain  feature  positions  to  be  much  more  accurate 
than the Z component. Therefore, given the same level of accuracy in two views, we
expect the X and Y (translation) motion components, in comparison to the Z component,
to reflect the same higher accuracy. The reason this is not the case is that embedded in these
uncertainty results is also the well-known X/Y translation-rotation ambiguity of motion
vision. Roughly speaking, an X/Y translation of magnitude T with respect to a target at
distance A along the viewing direction and a Y/X rotation of $\tan^{-1}(T/A)$ [rad] produce
very similar image motions (or disparities). Thus, erroneous X/Y translation motions
(of magnitude T) can be offset by corresponding erroneous Y/X rotations of $\tan^{-1}(T/A)$
[rad]. Referring back to fig. 7, if the X/Y translation has a variance of roughly 2 [m] (when
utilizing 12 QUXGA cameras) with respect to a target at about 100 [m], it can be offset
with a Y/X rotation with variance $\tan^{-1}(0.02)$ [rad] ≈ 1 [deg]. This can be readily verified
from the XY rotation uncertainties for 12 cameras (top-left and middle-left plots for 12
QUXGA cameras). As another test, we have a variance of 12 [m] in X/Y translation
with 6 low-resolution VGA cameras. This can be offset with a Y/X rotation variance of
$\tan^{-1}(0.12)$ [rad] ≈ 7 [deg], which is the calculated uncertainty for these rotations.
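
The two numerical checks above can be reproduced in a few lines. This is just the arithmetic from the text; the function name is hypothetical.

```python
import math


def offsetting_rotation_deg(translation_m, distance_m):
    """Rotation (in degrees) whose image motion mimics a lateral translation."""
    return math.degrees(math.atan(translation_m / distance_m))


print(round(offsetting_rotation_deg(2.0, 100.0), 2))    # 1.15, i.e. about 1 [deg]
print(round(offsetting_rotation_deg(12.0, 100.0), 2))   # 6.84, i.e. about 7 [deg]
```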

Overcoming  the  rotation-translation  ambiguity  of  visual  motion  typically  calls  for  an 
imaging  system  with  a  large  field  of  view,  and  terrain  features  with  disparate  depth  values. 
While the multi-camera conical imaging system accommodates a relatively large field of
view, the inability to accurately estimate the depth Z of terrain features works against us.
One  solution,  quite  appropriate  for  UAVs,  is  to  determine  the  rotational  motions  from  high- 
precision  gyros  (e.g.,  RLG)  that  are  typically  installed  on  these  platforms.  With  rotation 
known, the translational motion can then be estimated with more precision. Determining
translation from the closed-form solution of absolute orientation, with known rotation, is
trivial: 1) apply the known rotation to the measurements in one coordinate system, 2)
determine the centroids of the two sets of measurements, and 3) determine the difference of the
two centroids. Having established that we can determine the X and Y coordinates of
terrain  features  with  high  accuracy  (order  of  decimeter  or  less  depending  on  the  number  of 
cameras  and  resolution),  we  expect  the  precision  to  carry  over  to  the  estimated  X  and  Y 
components of the translational motion. Computer simulation presented in fig. 8 verifies the
improvement in the estimated translation, assuming 0.1 [deg] angle measurement variance
from the RLG. Included in the new results, with the revised X/Y translation uncertainties
of roughly 1 [m], is the component due to rotational uncertainty. (Based on our analysis
above, 0.1 [deg] X/Y rotation uncertainty can lead to a translation uncertainty of roughly
100 tan(0.1 [deg]) ≈ 0.17 [m] at a 100 [m] terrain elevation.) For consistency with the results in
fig. 7 and to restrict our study to the performance of the panoramic imaging system, we
concentrate  in  the  balance  of  this  section  on  the  uncertainty  measurements  where  all  the 
motion  estimates  are  determined  from  visual  cues  (assuming  no  information  from  external 
angle  measuring  devices).  As  we  see  shortly,  the  case  where  the  camera  angle  is  varied 
requires  special  attention. 
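
The three-step translation estimate with known rotation, as listed earlier in this section, can be sketched as follows. The convention points_b ≈ R points_a + t, the row-wise point layout, and the function name are assumptions made for illustration.

```python
import numpy as np


def translation_with_known_rotation(points_a, points_b, R):
    """Translation t such that points_b ~= R @ points_a + t, with R given.

    points_a, points_b: (N, 3) arrays of corresponding 3-D measurements
    (assumed layout); R: 3x3 rotation, e.g. from a gyro.
    """
    rotated_a = points_a @ R.T                 # 1) apply the known rotation
    centroid_a = rotated_a.mean(axis=0)        # 2) centroids of both sets
    centroid_b = points_b.mean(axis=0)
    return centroid_b - centroid_a             # 3) difference of centroids
```

Because the estimate is a difference of averages, independent zero-mean errors in the individual 3-D points tend to average out as more points are used, while any error in R propagates directly into t.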

In fig. 9, we have determined the variations in the estimation uncertainty when the
radius of a twelve-camera structure is increased. The uncertainty decreases roughly by a factor of two
when the radius changes from 0.25 [m] to 0.6 [m]. We can readily verify that the major
contributor to the uncertainty is the translation-rotation ambiguity.

In fig. 10, we have explored how the uncertainty changes with the density of the 3-D
measurements. On the horizontal axis, "Distance between 3-D points" of 5, 10, ... means
that one out of 5, 10, ... points on the target terrain has been utilized in the computations.
As the density of the measurements increases, the estimation accuracy improves in direct
proportion. In other words, if twice as many points are used,
the uncertainty drops by a factor of 2.

Finally, we examine in fig. 11 the impact of the camera viewing angle. Here, there
appears  to  be  a  contradiction  with  earlier  results,  at  first  glance.  While  we  had  established 
that  increasing  the  camera  angles  resulted  in  some  slight  increased  uncertainty  in  the 
estimation  of  the  target  feature  positions,  the  motion  parameters  estimated  from  these 
3-D  points  seem  to  be  more  accurate.  The  explanation  goes  as  follows:  When  we  increase 
the  camera  angles,  approaching  90  [deg],  all  cameras  look  (almost)  downward  and  thus  a 
larger  region  of  the  terrain  is  viewed  by  nearly  all  of  the  cameras.  As  a  result,  there  is 
a  much  lower  uncertainty  in  the  estimated  3-D  positions  of  a  large  portion  of  the  terrain 
features, compared to the case where the cameras are arranged at slanted viewing angles.
Consequently, we arrive at a more accurate motion estimate when a much larger percentage
of the 3-D positions are known with high precision. This emphasizes the advantage of
utilizing our theoretical model that accounts for all of the key factors. We have repeated
the same experiment, assuming again knowledge of the rotation angles up to 0.1 [deg]
uncertainty; see fig. 12. We note that the X/Y translation uncertainties drop by roughly a
factor  of  2,  down  to  nearly  0.5  [m]  for  a  viewing  angle  of  90  [deg]. 

4.2  3-D  Reconstruction  from  Multiple  2-D  Views 

We now present the results of various simulations from the latter part of our study: to
establish accuracies in estimating the 3-D coordinates of terrain features and the UAV
poses from images acquired from multiple large-baseline views, utilizing noisy GPS and
altitude measurements to determine the UAV's absolute positions. First, 3-D coordinates
of selected terrain targets are determined by processing images at different UAV
positions along the flight path, say roughly every A/k (k = 5 : 10) meters, where A is
the flight altitude. Three methods described in Appendix 2 (the closed-form eight-point
algorithm, the closed-form small-rotation solution, and an iterative method based on nonlinear
constraints) are tested. The results deal more generally with the role of image resolution
on performance, independent of the configuration of the cameras in constructing the image;
whether we have a single high-resolution camera with a large FOV, or an array of
cameras as in our panoramic system, each with a lower resolution and smaller field of
view, is immaterial. Next, the located features are tracked in image pairs to determine
the  UAV  pose.  As  stated,  gyros  can  give  the  same  information,  but  the  drift  error  can  be 
significant  in  long-duration  surveillance  operations.  Integration  of  visual  servo  with  a  gyro 
can  be  considered  as  a  mechanism  to  estimate  and  rectify  the  drift,  while  enhancing  the 
estimation  accuracy. 

4.3  Closed-Form  Eight-Point  Algorithm 

The  classic  eight-point  algorithm,  described  in  section  2.1  of  Appendix  2,  requires  (not 
surprisingly)  a  minimum  of  eight  point  correspondences  in  two  views  for  a  solution.  Our 
simulations  have  been  performed  with  N  =  {15,30,45,60}  points  over  M  =  2  :  21  views. 
In  particular,  we  have  been  able  to  establish  the  improvement  with  the  use  of  M  >  2 
frames.  In  doing  so,  the  two-view  algorithm  was  applied  to  the  data  (image  features)  at 
a  reference  frame  and  a  second  position,  estimating  the  rotation  between  the  two  views 
(using  GPS  readings  to  estimate  the  translation).  This  was  repeated  for  up  to  20  image 
pairs  from  a  total  of  21  (one  at  reference  and  20  other)  views.  These  estimated  motions- 
from  the  reference  view  to  any  one  of  20  different  views-  have  been  used  in  applying  the 
solution  in  section  2.4  of  Appendix  2,  in  order  to  1)  estimate  the  3-D  positions  of  the  terrain 
targets,  2)  compare  the  estimated  accuracy  of  tracked  target  features  for  M  =  2  :  21  views. 
The  results  of  the  simulations  for  this  section  are  found  in  Appendix  3b. 
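
As a reminder of what the two-view step involves, here is a minimal sketch of the linear eight-point estimate of the essential matrix from calibrated (normalized) image coordinates. It is a textbook formulation and not necessarily identical to section 2.1 of Appendix 2; in particular, it does not show how the rotation is extracted or how the GPS-derived translation is used. The function name and input layout are assumptions.

```python
import numpy as np


def eight_point_essential(x1, x2):
    """Linear estimate of E such that [x2, 1] @ E @ [x1, 1] ~= 0.

    x1, x2: (N, 2) arrays of corresponding normalized image coordinates
    in views 1 and 2 (assumed layout), with N >= 8 and a non-planar scene.
    """
    n = x1.shape[0]
    if n < 8:
        raise ValueError("at least eight correspondences are required")
    x1h = np.column_stack([x1, np.ones(n)])
    x2h = np.column_stack([x2, np.ones(n)])
    # Each row holds the coefficients of the 9 entries of E (row-major).
    A = np.einsum("ni,nj->nij", x2h, x1h).reshape(n, 9)
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Enforce the essential-matrix structure: two equal singular values, one zero.
    U, s, Vt_e = np.linalg.svd(E)
    sigma = 0.5 * (s[0] + s[1])
    return U @ np.diag([sigma, sigma, 0.0]) @ Vt_e
```

The linear system becomes poorly conditioned when the tracked points are nearly coplanar, which is one reason a relatively flat terrain viewed from altitude is a difficult case for this method.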

As  an  example,  fig.  13  is  a  sample  set  of  results  for  the  altitude  of  500  [m].  In  each 
page  covering  one  GPS  variance  selection,  120  figures  depict  the  uncertainties  with  varying 
number  of  views,  image  resolutions,  and  2  choices  for  number  of  points  (a  second  set  gives 
the  same  for  two  other  choices  of  number  of  points).  The  20  cases  of  M  =  {2  :  21}  views 
have been given as a 5x4 array of plots of reconstruction variance versus point number.
Thus, each page contains 6 such arrays, where 2 side-by-side ones deal with two choices
of number of points (e.g., N = {15, 30} on one page and N = {45, 60} on the next page).
Going  down  the  rows,  we  cover  variations  in  image  resolution  L  =  {1,3,4}  [Kpix]. 

Before moving forward, we emphasize that the following conclusions are based on the application of the eight-point algorithm, and do not reflect the best accuracy achievable with alternative methods, including those discussed in subsequent sections. We start with noise-free GPS readings, solely to concentrate on the impact of image resolution and pixel localization accuracy:

•  Estimation  uncertainty  generally  improves  with  the  number  of  tracked  features,  views 
and  resolution; 

•  Variation with the number of views, from 2 to 21, is significant: by a factor of 5-10 with 15 tracked points, and 3-4 with 60 features;

•  Improvement with resolution, from 1 to 16 mega-pixel images, is very significant, by a factor of 10 or more;

•  Reconstruction variance is 1.5-2 [m] at L = 1 [Kpix] resolution, going down to 0.1-0.15 [m] at L = 4 [Kpix].

The addition of GPS measurement errors changes the above conclusions significantly, both in terms of the achievable accuracy and the improvements with the number of tracked features, views and resolution. This leads to the conclusion that, in comparison to the GPS measurement noise, image feature detection inaccuracies have far less impact on the estimation errors. The simulation results for GPS variances of $\sigma_{GPS}$ = 1-2 [m] show that

•  Estimation  uncertainty  generally  improves  with  the  number  of  tracked  features  and 
views,  but  not  image  resolution; 

•  Variation  with  the  number  of  views  from  2  to  21  is  roughly  by  a  factor  of  2,  when 
tracking  15-60  points; 

•  Reconstruction variance is about 15 [m] and 80 [m], at $\sigma_{GPS}$ = 1 [m] and $\sigma_{GPS}$ = 2 [m], respectively, based on roughly 40 (out of 60) tracked points over 21 frames at L = 1-4 [Kpix] image resolution.

We recall that the key objective in the application of the current solution is to locate selected 3-D terrain targets, and the simulations were done primarily to assess this performance (this is also why we have not addressed the accuracy of the estimated rotations from the eight-point algorithm). Our results clearly indicate that these accuracies are far from sufficient to provide a suitable mechanism for the second application, which is to track known terrain targets for computing the UAV pose (and/or to estimate the gyro drift). Finally, construction of super-resolution imagery is highly unlikely with such large inaccuracies. Consequently, simulations for altitudes of 2000 [m] and 4000 [m] have been performed and documented in Appendix 3b (figs. 14-15), but we do not elaborate on them here, as the next two methods perform far better than the eight-point algorithm.

4.4  Small  Rotation  Angle  Approximation 

The method of section 2.3 in Appendix 2 has been employed to estimate the UAV pose and the 3-D positions of ground targets, but it is the latter that we are primarily concerned with in the first operational scenario. We need to clarify that this method typically works well either 1) where frame-to-frame rotations are small, of the order of a few degrees, or 2) if applied after one frame is de-rotated using some reasonable knowledge, to within a few degrees, of the underlying (potentially large) rotation angle. This can be based on the (potentially rough) gyro measurement or the image-based estimate from the last UAV position. While a minimum of 3 points is sufficient to arrive at a solution, conclusions from the experiments with the eight-point algorithm motivated us to start with a higher number of matched features: N = {15,30,45,60}, as in the previous case.
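
To make the "few degrees" condition concrete, the sketch below compares an exact rotation matrix with the usual first-order form $R \approx I + [\omega]_\times$, and shows how de-rotating with a rough prior leaves a small residual rotation. It assumes the standard linearization; the exact formulation of section 2.3 of Appendix 2 is not reproduced here, and all function names are hypothetical.

```python
import numpy as np


def skew(w):
    """Cross-product matrix [w]x."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def rotation_exact(w):
    """Rodrigues formula: rotation matrix for the rotation vector w (radians)."""
    w = np.asarray(w, dtype=float)
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = skew(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def rotation_small_angle(w):
    """First-order approximation R ~ I + [w]x, valid only for small |w|."""
    return np.eye(3) + skew(np.asarray(w, dtype=float))


def approximation_error(w):
    """Frobenius norm of the gap between the exact and linearized rotation."""
    return np.linalg.norm(rotation_exact(w) - rotation_small_angle(w))


def derotate(R, w_prior):
    """Residual rotation after removing a prior estimate (e.g. a rough gyro reading)."""
    return R @ rotation_exact(w_prior).T
```

For a 2 [deg] rotation the linearization error is below 1e-3 (Frobenius norm), whereas at 20 [deg] it grows to several hundredths. De-rotating the 20 [deg] case with an 18 [deg] prior about the same axis brings the residual back into the small-angle regime.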

Fig. 16 in Appendix 3c shows sample results for an altitude of 500 [m]. Again, simulations with noise-free GPS readings are informative in assessing the impact of the image resolution and feature detection accuracy:

•  Estimation  uncertainty  improves  little  with  the  number  of  tracked  features,  more 
with  increasing  views  and  most  significantly  with  image  resolution; 

•  Variation  with  the  number  of  views,  from  2  to  21,  is  by  a  factor  of  about  3-4; 

•  Improvement with resolution, from 1 to 16 mega-pixel images, is by a factor of 8-10;

•  Reconstruction variance is roughly 1.5-2 [m] at L = 1 [Kpix] resolution, down to 0.1-0.15 [m] at L = 4 [Kpix] resolution.

This GPS noise-free performance is practically the same as with the eight-point algorithm. Not surprisingly, both methods perform well when the GPS errors vanish. However, GPS measurement errors affect the above conclusions to a much lesser extent than with the previous solution. We conclude from the simulation results for GPS variances $\sigma_{GPS}$ of 1-2 [m]:

•  Estimation  uncertainty  generally  improves  with  the  number  of  tracked  features  and 
views,  as  well  as  with  image  resolution  when  tracking  features  over  a  small  number 
of  views; 

•  Variation  in  accuracy,  for  2  to  21  views,  is  roughly  by  a  factor  of  2-3; 

•  Accuracy  improves  by  a  factor  of  2  from  L  =  1  [Kpix]  to  L  =  3  [Kpix]  resolution, 
with  no  further  improvement  at  L  =  4  [Kpix]  resolution. 

•  Reconstruction variance is roughly 3 [m] for $\sigma_{GPS}$ = 1 [m]; [...] $\sigma_{GPS}$ = 2 [m], for about 40 (out of 60) tracked points over 21 frames at L = 1-4 [Kpix] image resolution.

This  performance  with  noisy  GPS  measurements  corresponds  to  a  six-fold  improvement 
with  respect  to  the  8-point  algorithm. 

For the intermediate altitudes of 2000-4000 [m] (shown in figs. 17-18 of Appendix 3c), the same general conclusions are drawn. As a result, we simply list the quantitative performance results:

•  A = 2000 [m]: 3-D localization error variance is roughly 0.2 [m] for noise-free GPS data. It jumps to roughly 20 [m] with GPS variance of $\sigma_{GPS}$ = 1 [m], [...] $\sigma_{GPS}$ = 2 [m] at L = 4 [Kpix] resolution (for about 40 out of 60 tracked points over 21 frames);

•  A = 4000 [m]: 3-D localization error variance is roughly 0.8 [m] for noise-free GPS data. It increases to roughly 25 [m] with GPS variance of $\sigma_{GPS}$ = 1 [m], and finally about 80 [m] for $\sigma_{GPS}$ = 2 [m] at L = 4 [Kpix] resolution (for about 40 out of 60 tracked points over 21 frames).

4.5  Estimation of Ground Targets from Nonlinear Method

We now present the results from the application of the nonlinear method in section 2.2 of Appendix 2, to estimate both the UAV pose and the 3-D positions of the ground targets (see Appendix 3d). Again, we are solely concerned with the accuracy of the 3-D targets.

Fig. 19 in Appendix 3d depicts the results of various simulations for the altitude of A = 500 [m]. For each case, we have shown here sample results for 3 of the N = {3, 6, 9, 12} points tracked. Complete results for all of the points have been given in Appendix 3e. We have used fewer features than in the prior two cases for several reasons. First, jumping ahead, the results seem to exhibit asymptotic convergence to a minimum around N = 12 points. Furthermore, tracking 12 points provides sub-meter terrain localization accuracy, which we had aimed for. Finally, these simulations, based on an iterative nonlinear algorithm, have been rather time-consuming and become more so as the number of features increases. This imposed a constraint on how far we could carry the analysis, while also investigating the performance of the two alternative solutions (we started with this solution first, and later tested the other two). As an example, typically a few hundred iterations of the nonlinear algorithm need to be carried out for convergence. Furthermore, each simulation has to be performed several hundred times or more to establish statistical performance measures. In some cases, more runs are necessary, as we have to randomly choose noise samples for the image feature positions and the GPS readings. In addition, these experiments have been repeated for various resolutions and numbers of frames, as well as GPS noise levels. While the uncertainty may be lowered by increasing the number of terrain features, the asymptotic behavior suggests that the improvement may be relatively marginal.

Various plots have been arranged in 3x3 arrays, with rows corresponding to one of the 3 points, while columns are assigned to the X, Y and Z components. For each altitude, the results for tracking N = {3,6,9,12} points have been arranged on two pages: N = 3/6 on one page in the left/right column, followed by N = 9/12 points on the next page in the left/right column. On each page, 3x3 arrays are arranged row-wise for different image resolutions, L = 1,3,4 (e.g., first-row 3x3 arrays in fig. 19 correspond to 3 of the tracked points with N = 3 on the left and N = 6 on the right for resolution 1 [Kpix] x 1 [Kpix], the second row for 3 [Kpix] x 3 [Kpix], and the last row for 4 [Kpix] x 4 [Kpix]).

Once used to this arrangement convention, we can concentrate on each plot, which shows three (R,G,B) curves for GPS uncertainties of $\sigma_{GPS}$ = {0,1,2}. The vertical axis, as before, is the estimation uncertainty, while the horizontal axis represents the number of frames used to track the feature; the more frames, the higher the accuracy. Given the randomness of the process, and that the results do vary with the distribution of the image features, there are some variations in this trend, and the dashed line is the smooth approximation of the expected behavior.

From these results, we conclude the following:^5

•  Estimation uncertainty improves when increasing the number of views, tracked features, resolution and altitude;

•  Estimation  uncertainty  is  slightly  lower  in  the  Z  direction; 

•  The  uncertainty  is  typically  less  than  one  meter,  when  tracking  the  features  over  15 
images. 

•  The GPS inaccuracy becomes less significant as the numbers of features and views increase.

The statistics summarizing the complete results, shown in figs. 22-24 of Appendix 3e for the 3 altitudes, are given in table 1. Small discrepancies, where accuracy is marginally better for a lower GPS accuracy, are mainly due to the random selection of points in the various simulations: it is conceivable that some cases involved an unfavorable arrangement of feature points, with some clustered within a local region of the image or covering a smaller field of view (as we discussed in presenting the results from the earlier two methods). The impact becomes more evident when fewer features are utilized, as in these simulations, in comparison to the previous methods.

However, the main conclusion should be based on the order of magnitude of these numbers, which is considerably lower than for the results based on the closed-form solutions with a larger number of tracked features^6. Thus, it is possible to achieve sub-meter accuracy with one to two dozen features, ideally distributed over as large a FOV as possible, tracked in one to two dozen frames.

^5 Recall that the results for an altitude of 500 [m] have been determined for a feature localization uncertainty of $\sigma$ = 1, in comparison to $\sigma$ = 1/3 for the other 2 altitudes, as a more stringent test.

^6 The higher inaccuracy of the eight-point algorithm is not surprising where the depth variations among the scene points are small. Additionally, we cannot incorporate knowledge from the GPS readings until the essential matrix $E_{ij} = [t_{ij}]_\times R_{ij}$ has already been calculated.




Altitude   GPS Err                 XY Uncertainty [m]          Z Uncertainty [m]
[m]        $\sigma_{GPS}$ [m]      1 Kpix  3 Kpix  4 Kpix      1 Kpix  3 Kpix  4 Kpix

500        0                       0.7     0.8     0.9         ?       0.3     ?
500        1                       1.1     0.9     0.9         0.3     ?       0.3
500        2                       1.7     1.3     0.9         0.3     0.3     0.3

2000       0                       ?       ?       ?           ?       ?       ?
2000       1                       ?       ?       ?           ?       ?       ?
2000       2                       1.?     0.4     0.4         0.1     0.1     ?

4000       0                       0.4     0.2     0.1         0.1     0.1     0.1
4000       1                       0.4     0.2     0.2         0.2     0.1     0.1
4000       2                       0.5     0.4     0.4         0.1     0.1     0.1

Table 1: Variances of feature positions and variations with resolution and GPS uncertainty for 3 altitudes. Some discrepancies, where accuracy is slightly better for a lower GPS accuracy, are due to the random selection of points in various simulations. See text for more details. (?: entry illegible; in partially legible rows the surviving entries are listed in reading order.)


4.6  Pose  Estimation;  Known  Ground  Targets 

Applying the nonlinear optimization method described in section 2.5 of Appendix 2, we have determined the camera pose by tracking a number of terrain targets whose positions are known up to some uncertainty. We have adopted the achievable accuracy in locating ground targets from the results of the previous section at each altitude, in addition to the assumed GPS uncertainty and image feature localization errors, to perform simulations for determining the error variances of the UAV pose. The angles are expressed as components of a 3-D rotation vector $\omega$ computed from the rotation matrix R by the Rodrigues formula; see section 1.2 of Appendix 1. We have investigated the variations with the number of terrain features and the image resolution. From the results, given in fig. 25 collectively for the three altitudes, we draw the following conclusions:

•  Estimation  error  variance  decreases  with  increasing  image  resolution; 

•  Improvements  in  error  variance  with  image  resolution  grow  with  GPS  error:  that  is, 
there  is  more  improvement  in  accuracy  from  increasing  the  image  resolution,  when 
GPS  measurements  are  less  accurate. 

•  Estimation uncertainty improves with the number of features tracked. It is unstable when the minimum of 3 features is used, as it depends strongly on the configuration of these points. It levels off at around 30 features;

•  Accuracy in estimating the heading angle is much higher than for pitch and roll (XY) components. In fact, it approaches that for noise-free GPS readings with increased resolution and number of tracked points. In contrast, the pitch and roll accuracy with erroneous GPS readings never approaches the noise-free case. This is expected due to the impact of GPS uncertainty in determining the UAV translational motion, subsequently affecting the interpretation of rotations due to the translation-rotation ambiguity;

•  GPS error becomes less significant at higher altitudes.

Altitude   GPS Err                 $\omega_{xy}$ uncertainty [deg]    $\omega_z$ uncertainty [deg]
[m]        $\sigma_{GPS}$ [m]      1 Kpix  3 Kpix  4 Kpix             1 Kpix  3 Kpix  4 Kpix

500        0                       ?       ?       ?                  ?       ?       ?
500        1                       ?       ?       ?                  ?       ?       ?
500        2                       0.71    0.63    0.32               0.02    0.01    ?

2000       0                       ?       ?       ?                  ?       ?       ?
2000       1                       0.35    0.31    0.20               ?       ?       ?
2000       2                       0.53    0.40    0.40               0.06    0.04    ?

4000       0                       0.05    0.02    0.01               ?       ?       ?
4000       1                       0.20    0.15    0.15               0.01    ?       ?
4000       2                       0.36    0.31    0.30               ?       ?       ?

Table 2: Pose angle estimation variances by tracking known terrain targets at various altitudes. See text for more details. (?: entry illegible; in partially legible rows the surviving entries are listed in reading order.)

Table 2 summarizes the variances for various altitudes, image resolutions, and GPS uncertainties. With GPS error variance at $\sigma_{GPS}$ = 1 [m], we can achieve roughly 0.15-0.25 [deg] in the XY and 0.01-0.05 [deg] in the Z components of the rotation vector $\omega$ for the image resolution of 4 [Kpix] x 4 [Kpix].
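
For readers who want to relate these rotation-vector components to a rotation matrix, the sketch below converts between the two using the Rodrigues formula. It is an illustrative helper only, with hypothetical function names, and the inverse map as written is not robust for angles close to 180 [deg], which is far outside the range considered here.

```python
import numpy as np


def matrix_from_rotation_vector(w):
    """Rodrigues formula: rotation vector (axis times angle, radians) to matrix."""
    w = np.asarray(w, dtype=float)
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def rotation_vector_from_matrix(R):
    """Inverse map, assuming the rotation angle is well below 180 degrees."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return np.zeros(3)
    # The antisymmetric part of R equals sin(theta) times [axis]x.
    axis = np.array([R[2, 1] - R[1, 2],
                     R[0, 2] - R[2, 0],
                     R[1, 0] - R[0, 1]]) / (2.0 * np.sin(theta))
    return theta * axis
```

The first two components of the returned vector correspond to the XY (pitch and roll) terms and the third to the Z (heading) term discussed above.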

While we have not reported results here, our limited simulations indicate that the estimation error variances are rather sensitive to the uncertainty in the UAV altitude measurements. Thus, improvement can be significant with the use of high-precision altimeters.

5  Discussions,  Conclusions  and  Recommendations 

The so-called computational camera is undoubtedly a very powerful and flexible paradigm to achieve a seamless unification of imaging sensors and the computer. It accommodates the capability to overcome various complexities and (or) shortcomings of existing imaging systems, with a flexible, reconfigurable arrangement of several low-cost conventional cameras, in order to generate super-resolution imagery, in theory over the entire sphere of viewing directions. The computational power of today’s processors currently precludes real-time realization, but this will become possible in the near future with the tremendous growth in processor performance at diminishing cost.

This work primarily concentrated on the performance assessment of a panoramic system over a range of design parameters (e.g., resolution and number of cameras) that can provide real-time high-resolution imaging and (or) transmission to a central command station, based on technologies available currently or in the very immediate future. In fact, performance results based on certain parameter ranges in our analysis, say 1-4 [Kpix] x 1-4 [Kpix] image resolution over 90 [deg] in FOV, may be viewed as somewhat conservative.

For the earlier part of the work, studying the performance of the panoramic system for terrain imaging, theoretical models of feature localization and motion estimation accuracies have been derived and utilized. We have used as the primary benchmark the ability to localize a terrain feature, and studied how the localization uncertainty varies as a function of a number of imaging system parameters. The results have several implications for the application of multi-camera technology to aerial imagery. First, they provide quantitative measures of the trade-offs in the number and resolution of the cameras. Next, they enable designing a system that provides optimal performance for a given altitude range. Also, we can conclude what information can and cannot be extracted with acceptable accuracy. For example, X/Y coordinates (relative to the UAV) can be established with high accuracy from the multi-camera system at one view, and variations in X/Y accuracy with the number of cameras can be significant. In contrast, given that the camera separations in a cluster are limited by the UAV size, we cannot extract from the multiple overlapping images useful information about the vertical distance (Z coordinate). Thus, we cannot determine the vertical size of terrain features from the panorama at one view. Yet, a super-resolution panorama can be generated by the integration of multiple images based on a simplistic locally planar terrain model, and we can evaluate the improvement in resolution according to our quantitative results based on the number of cameras. Furthermore, it has been shown that estimates in the vertical direction can be determined reliably by the application of motion vision techniques, given several panoramas (at assumed resolution and field of view) from multiple (a dozen or more) UAV positions along its trajectory. Specifically, an error variance of 0.3 [m] or less can be achieved in the 500-4000 [m] altitude range studied.

In assessing the imaging system’s performance in computing UAV motions from estimated 3-D coordinates of terrain features at each view, we have utilized the variances of the estimated motion parameters based on a closed-form solution to the absolute orientation problem. We have verified significant improvements with increased camera resolution, number of cameras, density of terrain features used for motion estimation, as well as the viewing angle. More importantly, however, utilizing high-precision gyros to determine the UAV pose at each view will have the most significant impact, allowing us to estimate the translational motion components with much better accuracy. The use of 3-D feature coordinates from each view to compute 3-D motion with acceptable accuracy has the following drawbacks:

•  Limited  to  very  low  altitudes,  say  100  [m]; 

•  Requires a very large density of features.

Alternatively, we studied the use of GPS readings along the UAV flight path in determining the 3-D coordinates of terrain features. Subsequently, 2-D visual cues obtained by tracking the features across images enable us to estimate the UAV pose. This approach, based on 2-D image projections of terrain features, works well for higher altitudes while having to track only about 3 dozen points.

We can list some remedies for improvement based on our findings, but did not investigate them further in this study, in part due to project time limitations and in part because we were not striving to meet imposed accuracy bounds or limits:

•  Positioning performance at lower altitudes can be improved by revising a "simplistic" assumption made in our simulations: we have used the computed error variances of the terrain feature positions at each altitude (section 4.5) to determine the variances of pose parameters for the same altitude. However, given that we can determine the 3-D coordinates of target features more accurately at higher altitudes (because GPS errors are less significant as a percentage of the altitude), 3-D terrain coordinates may be first determined from the images at higher altitudes (e.g., A = 4000 [m] or higher) and subsequently used to more accurately estimate pose at lower altitudes.

•  Using higher resolution at lower altitudes for pose estimation, though the accuracy for the intermediate (and higher) altitudes appears to be leveling off at the resolutions we have explored;

•  Establishing UAV positions more accurately, as limited GPS and altimeter accuracies are key factors in higher estimation variances;

•  Locating landmarks and fixed terrain targets with better precision using other sources;

•  Integrating gyro and visual servo, which should not only produce better results, but also enable the estimation and overcoming of gyro drift.

Certain algorithmic modifications are likely to enhance performance. For example, the later part of our work on target positioning and UAV pose estimation is based on the knowledge of 2-3 dozen 3-D positions in one coordinate system (reference view) and the 2-D projections in two views. However, 2-D correspondences of numerous other image features, without knowledge of their 3-D coordinates, can provide more constraints in determining the pose parameters (the epipolar constraint of the multiple view geometry, discussed in Appendix 2). For very low altitudes, say 100 [m] or less, the binocular cues from multiple cameras of the panoramic imaging system may be utilized. Finally, terrain relief is often negligible relative to the UAV altitude at mid to high altitudes, and thus the surface may be treated as a flat plane. For this case, closed-form solutions for planes can sometimes provide a more accurate estimate [25, 6, 17]. We have not tested the solutions for planar surfaces because we have concentrated on solutions that impose no restriction on terrain topography.

Having said this, a number of factors, not the target of this study, can result in some deterioration in actual performance. For example, we have not directly considered inaccuracies associated with system calibration. As the image resolution increases, determination of the imaging system’s internal and external parameters becomes increasingly important. Moreover, since we can achieve the same resolution over a fixed FOV in several ways, there are certain trade-offs among 1) achieving high-precision external calibration of a cluster of cameras at moderate cost, each with lower FOV and negligible distortion; 2) intrinsic calibration of a single camera or a smaller number of cameras at higher resolutions and distortion rates; 3) avoiding performance degradation due to calibration errors by improving the unit camera performance, say distortion rate, at higher cost. Some of these questions may be answered through computer simulation, e.g., assuming higher variances for 2-D image feature positions and (or) a spatially varying uncertainty model that would represent the errors due to image distortion effects.

In conclusion, we have tried to study a number of key issues that play a role in the performance of a high-resolution panoramic imaging system under certain UAV operational conditions. The underlying issue is to establish whether image resolution needs to or should be increased, and the pay-off, to meet certain levels of accuracy. We believe that the results do generally provide a good assessment of the performance of a high-resolution vision system. However, given the number of variable factors, some of which are not addressed in this study, the design and simulations of a system should directly take into account the specific requirements of the application (e.g., image resolution), limitations (e.g., weight, size, UAV capabilities), desired accuracies, etc. Consequently, one can address trade-off issues while simultaneously designing conservatively to overcome unaccounted factors.


Appendix 1: Theoretical Estimation of Uncertainty Bounds

We  present  here  the  theoretical  results  on  the  estimation  of  uncertainty  bounds  to 
establish  accuracy  in  3-D  reconstruction,  motion  estimation  and  positioning. 


1.1  3-D  Reconstruction  by  Triangulation 

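Given the corresponding points on two images and the camera projection matrices^7, the point in the 3-D world can be located by triangulation. Let $\mathbf{X} = \rho[X, Y, Z, 1]^T$, $\mathbf{x}_1 = \mu_1[x_1, y_1, 1]^T$, and $\mathbf{x}_2 = \mu_2[x_2, y_2, 1]^T$ be a 3-D point and its projections onto cameras 1 and 2, respectively, based on the camera projection matrices $\mathcal{P}_1$ and $\mathcal{P}_2$:

$$
\begin{cases}
\mathbf{x}_1 = \mathcal{P}_1 \mathbf{X} \\
\mathbf{x}_2 = \mathcal{P}_2 \mathbf{X}
\end{cases}
\qquad (1)
$$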


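Of the many methods to solve the above equations for $\mathbf{X}$ [15], a closed-form solution is most common. First, the equations in (1) are combined into the form $A\mathbf{X} = 0$, which is linear in $\mathbf{X}$. Eliminating the homogeneous scale factor by a cross product gives three equations for each image point, of which two are linearly independent. For example, for the first image, $\mathbf{x}_1 \times (\mathcal{P}_1 \mathbf{X}) = 0$ gives

$$
\begin{cases}
x_1 (\mathbf{p}_1^{3T} \mathbf{X}) - (\mathbf{p}_1^{1T} \mathbf{X}) = 0 \\
y_1 (\mathbf{p}_1^{3T} \mathbf{X}) - (\mathbf{p}_1^{2T} \mathbf{X}) = 0 \\
x_1 (\mathbf{p}_1^{2T} \mathbf{X}) - y_1 (\mathbf{p}_1^{1T} \mathbf{X}) = 0
\end{cases}
\qquad (2)
$$

where $\mathbf{p}_1^{kT}$ ($k = 1, 2, 3$) are the rows of $\mathcal{P}_1$. Rewriting the above equations in the form $A\mathbf{X} = 0$, where

$$
A =
\begin{bmatrix}
x_1 \mathbf{p}_1^{3T} - \mathbf{p}_1^{1T} \\
y_1 \mathbf{p}_1^{3T} - \mathbf{p}_1^{2T} \\
x_2 \mathbf{p}_2^{3T} - \mathbf{p}_2^{1T} \\
y_2 \mathbf{p}_2^{3T} - \mathbf{p}_2^{2T}
\end{bmatrix}
\qquad (3)
$$

we obtain the desired solution in the form $[\rho X, \rho Y, \rho Z, \rho]^T$ from the singular value decomposition of $A$.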

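With a cluster of N cameras, we have multiple observations $\mathbf{x}_i = \mu_i[x_i, y_i, 1]^T$, $i = 1, 2, \ldots, N$, and the matrix $A$ can be rewritten as

$$
A =
\begin{bmatrix}
x_1 \mathbf{p}_1^{3T} - \mathbf{p}_1^{1T} \\
y_1 \mathbf{p}_1^{3T} - \mathbf{p}_1^{2T} \\
x_2 \mathbf{p}_2^{3T} - \mathbf{p}_2^{1T} \\
y_2 \mathbf{p}_2^{3T} - \mathbf{p}_2^{2T} \\
\vdots \\
x_N \mathbf{p}_N^{3T} - \mathbf{p}_N^{1T} \\
y_N \mathbf{p}_N^{3T} - \mathbf{p}_N^{2T}
\end{bmatrix}
\qquad (4)
$$

Let $W = A^T A = V D V^T$ denote the eigenvalue decomposition of the 4x4 matrix $W$, where $V$ is a 4x4 orthogonal matrix, and $D = \mathrm{diag}\{\lambda_1, \lambda_2, \lambda_3, \lambda_4\}$ is the diagonal matrix of eigenvalues in descending order. It can be readily shown that the 3-D point $\mathbf{X}$ can be determined from the eigenvector $\mathbf{v}_4 = [v_1, v_2, v_3, v_4]^T$ corresponding to the smallest eigenvalue $\lambda_4$.

^7 A camera projection matrix can be readily determined through a priori calibration experiments.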
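
The N-view construction above can be sketched in a few lines. The code below is an illustrative reading of equation (4) and the eigenvector solution, not the authors' implementation; `project` and `triangulate` are hypothetical names, and the example cameras in the usage note are idealized (identity rotation, unit focal length).

```python
import numpy as np


def project(P, X):
    """Project the 3-D point X (length 3) with the 3x4 projection matrix P."""
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]


def triangulate(P_list, xy_list):
    """Linear triangulation of one point seen in N >= 2 views.

    P_list: list of 3x4 projection matrices.
    xy_list: list of (x, y) image measurements, one per view.
    """
    rows = []
    for P, (x, y) in zip(P_list, xy_list):
        # Two independent equations per view, as in the rows of A.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.asarray(rows)
    W = A.T @ A
    # eigh returns eigenvalues in ascending order, so column 0 is the
    # eigenvector of the smallest eigenvalue (lambda_4 in the text).
    _, eigvecs = np.linalg.eigh(W)
    v = eigvecs[:, 0]
    # Convert from homogeneous [rho X, rho Y, rho Z, rho] to Euclidean.
    return v[:3] / v[3]
```

As a usage example, five views spaced 50 [m] apart along X, looking at a point 500 [m] away, can be modeled with `P = np.hstack([np.eye(3), -c.reshape(3, 1)])` for each camera center `c`; with noise-free projections the point is recovered to within numerical precision.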


1.1.1  Accuracy  in  3-D  Reconstruction 

We have shown that we can estimate the position of a 3-D space point from multiple observations in several cameras. However, we are interested in the accuracy of the solution, given noisy measurements and the finite precision due to the spatial quantization of the image at a particular resolution. The answer can be found from the covariance matrix $C_{\mathbf{X}}$ of the estimated 3-D point $\mathbf{X}$, based on the errors in the measurements $\mathbf{x}_i = \mu_i[x_i, y_i, 1]^T$, $i = 1, 2, \ldots, N$.

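If we define the measurement vector $\mathbf{v} = [x_1, y_1, x_2, y_2, \ldots, x_N, y_N]^T$, with $\sigma_{x_i}$ and $\sigma_{y_i}$, $i = 1, 2, \ldots, N$, as the variances of these measurements, the covariance $C_a$ of the elements $a_{m,n}$ of $W$ is given by [12]:

$$
C_a \big|_{16 \times 16} = \frac{\partial a}{\partial \mathbf{v}} \, \Gamma_{\mathbf{v}} \, \frac{\partial a}{\partial \mathbf{v}}^T
\qquad (5)
$$

where $\Gamma_{\mathbf{v}}$ is the covariance matrix of the measurement vector $\mathbf{v}$ (with the variances $\sigma_{x_i}$ and $\sigma_{y_i}$ on its diagonal),

$$
\frac{\partial a}{\partial \mathbf{v}} =
\begin{bmatrix}
\frac{\partial a_{1,1}}{\partial x_1} & \frac{\partial a_{1,1}}{\partial y_1} & \cdots & \frac{\partial a_{1,1}}{\partial x_N} & \frac{\partial a_{1,1}}{\partial y_N} \\
\frac{\partial a_{1,2}}{\partial x_1} & \frac{\partial a_{1,2}}{\partial y_1} & \cdots & \frac{\partial a_{1,2}}{\partial x_N} & \frac{\partial a_{1,2}}{\partial y_N} \\
\vdots & \vdots & & \vdots & \vdots \\
\frac{\partial a_{4,4}}{\partial x_1} & \frac{\partial a_{4,4}}{\partial y_1} & \cdots & \frac{\partial a_{4,4}}{\partial x_N} & \frac{\partial a_{4,4}}{\partial y_N}
\end{bmatrix}_{16 \times 2N}
\qquad (6)
$$

and

$$
\begin{aligned}
\frac{\partial a_{m,n}}{\partial x_i} &= 2 x_i \mathcal{P}_{i,3,m} \mathcal{P}_{i,3,n} - (\mathcal{P}_{i,3,m} \mathcal{P}_{i,1,n} + \mathcal{P}_{i,3,n} \mathcal{P}_{i,1,m}) \\
\frac{\partial a_{m,n}}{\partial y_i} &= 2 y_i \mathcal{P}_{i,3,m} \mathcal{P}_{i,3,n} - (\mathcal{P}_{i,3,m} \mathcal{P}_{i,2,n} + \mathcal{P}_{i,3,n} \mathcal{P}_{i,2,m})
\end{aligned}
\qquad (7)
$$

with $\mathcal{P}_{i,r,c}$ denoting the element in row $r$ and column $c$ of $\mathcal{P}_i$.

Next, it can be shown that the covariance of the reconstructed point $[\rho X, \rho Y, \rho Z, \rho]^T$ (in projective space) is

$$
C_{\mathbf{X}} = V \Delta_4 V^T H_{\mathbf{v}_4} \, C_a \, H_{\mathbf{v}_4}^T V \Delta_4 V^T
\qquad (8)
$$

where

$$
\Delta_4 = \mathrm{diag}\{(\lambda_4 - \lambda_1)^{-1}, (\lambda_4 - \lambda_2)^{-1}, (\lambda_4 - \lambda_3)^{-1}, 0\}
\qquad (9)
$$

$$
H_{\mathbf{v}_4} = [\, v_1 I_{4 \times 4} \;\; v_2 I_{4 \times 4} \;\; v_3 I_{4 \times 4} \;\; v_4 I_{4 \times 4} \,]_{4 \times 16}
\qquad (10)
$$

where $I_{4 \times 4}$ is a 4 x 4 identity matrix.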

Finally,  the  covariance  of  the  Euclidean  point  [X,  Y,  is  Cx  =  JpCxJp,  where  Jp  is 
the  Jacobian  of  the  transformation  from  projective  to  Euclidean  space; 

r i  0  0  -4 

0^0  -4 
P 

_  0  0  i 

These analytical results allow us to estimate the reconstruction accuracy of any 3-D point $\mathbf{X}$ on the terrain, given its projections $\mathbf{x}_i$ ($i = 1, 2, \ldots, N$) into the $N$ images of the camera cluster. Arbitrary arrangement of the $N$-camera cluster, camera resolution, and terrain altitude are encoded in the projection matrix $\mathcal{P}_i$ of the $i$-th camera, and thus can be simulated.
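
As a rough illustration of how such a simulation might be coded, the sketch below chains the derivative of the elements of $W$ with respect to the measurements, a first-order perturbation of the eigenvector belonging to the smallest eigenvalue, and the projective-to-Euclidean Jacobian. It is an assumption-laden sketch rather than the report's implementation: the function names are hypothetical, independent measurement errors are assumed, and eigenvalues are handled in NumPy's ascending order (so index 0 plays the role of $\lambda_4$).

```python
import numpy as np

def build_W(proj_mats, v):
    """W = A^T A for the measurement vector v = [x1, y1, ..., xN, yN]."""
    rows = []
    for i, P in enumerate(proj_mats):
        rows.append(v[2 * i] * P[2] - P[0])
        rows.append(v[2 * i + 1] * P[2] - P[1])
    A = np.asarray(rows)
    return A.T @ A

def triangulate(proj_mats, v):
    """Return (Euclidean point, eigenvalues, eigenvectors), ascending order."""
    lam, V = np.linalg.eigh(build_W(proj_mats, v))
    return V[:3, 0] / V[3, 0], lam, V

def point_jacobian(proj_mats, v):
    """First-order Jacobian dX/dv (3 x 2N) of the triangulated point."""
    _, lam, V = triangulate(proj_mats, v)
    v_min = V[:, 0]
    # Derivative of every element a_{m,n} of W with respect to x_i and y_i.
    dadv = np.zeros((16, len(v)))
    for i, P in enumerate(proj_mats):
        for c, row in enumerate((P[0], P[1])):       # c = 0 -> x_i, c = 1 -> y_i
            dW = (2.0 * v[2 * i + c] * np.outer(P[2], P[2])
                  - np.outer(P[2], row) - np.outer(row, P[2]))
            dadv[:, 2 * i + c] = dW.ravel()
    # Weights 1 / (lambda_min - lambda_k) on the other three eigen-directions.
    phi = np.zeros(4)
    phi[1:] = 1.0 / (lam[0] - lam[1:])
    # H = [v1*I, v2*I, v3*I, v4*I], so that H @ vec(dW) = dW @ v_min.
    H = np.hstack([vk * np.eye(4) for vk in v_min])
    d_eigvec = V @ np.diag(phi) @ V.T @ H @ dadv     # 4 x 2N
    rho = v_min[3]
    X = v_min[:3] / rho
    Jp = np.hstack([np.eye(3), -X[:, None]]) / rho   # projective -> Euclidean
    return Jp @ d_eigvec

def point_covariance(proj_mats, v, variances):
    """C_X = J diag(variances) J^T for independent measurement errors."""
    J = point_jacobian(proj_mats, v)
    return J @ np.diag(variances) @ J.T
```

A quick sanity check is to compare `point_jacobian` against central finite differences of `triangulate`; the two should agree closely when the smallest eigenvalue of $W$ is well separated from the others.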


1.2  3-D  Motion  Estimation  by  Absolute  Orientation 

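We have already discussed the reconstruction of a point in 3-D space from two or more image observations. Now, suppose we know the coordinates of some 3-D points $\mathbf{X}_{1,i}$ ($i = 1, 2, \ldots, L$) at one position of the panoramic imaging system, move to a new location, and determine the positions $\mathbf{X}_{2,i}$ of the same points in the new camera coordinate system. Absolute orientation is the problem of using the measurements $\mathbf{X}_{1,i} = [X_{1,i}, Y_{1,i}, Z_{1,i}]^T$ and $\mathbf{X}_{2,i} = [X_{2,i}, Y_{2,i}, Z_{2,i}]^T$ of three or more points ($i = 1, 2, \ldots, L$; $L \ge 3$) to determine the motion of the imaging system between the two views. This problem has a closed-form solution [3, 20, 21].

Let $S_1$ and $S_2$ be the two sets of measured 3-D points:

$$
S_1 = \begin{bmatrix}
X_{1,1} & Y_{1,1} & Z_{1,1} \\
X_{1,2} & Y_{1,2} & Z_{1,2} \\
\vdots & \vdots & \vdots \\
X_{1,L} & Y_{1,L} & Z_{1,L}
\end{bmatrix} = [\, X_1 \;\; Y_1 \;\; Z_1 \,], \qquad
S_2 = \begin{bmatrix}
X_{2,1} & Y_{2,1} & Z_{2,1} \\
X_{2,2} & Y_{2,2} & Z_{2,2} \\
\vdots & \vdots & \vdots \\
X_{2,L} & Y_{2,L} & Z_{2,L}
\end{bmatrix} = [\, X_2 \;\; Y_2 \;\; Z_2 \,] \tag{12}
$$

We are looking for a transformation from coordinate system 1 to coordinate system 2 in the form

$$
\begin{bmatrix} X_{2,i} \\ Y_{2,i} \\ Z_{2,i} \end{bmatrix}
= R \begin{bmatrix} X_{1,i} \\ Y_{1,i} \\ Z_{1,i} \end{bmatrix} + T, \qquad i = 1, 2, \ldots, L \tag{13}
$$

where $R$ and $T$ describe the motion of the imaging system in terms of a $3 \times 3$ rotation matrix and a $3 \times 1$ translation vector. These unknowns can be computed through a least mean square error optimization over all the points:

$$
\arg\min_{R,T} \sum_{i=1}^{L} \left\| \mathbf{X}_{2,i} - (R\,\mathbf{X}_{1,i} + T) \right\|^2 \tag{14}
$$

under the constraint $R^T R = I$.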

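It has been shown that the optimum orthonormal matrix $R$ can be estimated from the matrix $M$ [3]:

$$
\begin{aligned}
M_{3 \times 3} &= (S_2 - \bar{S}_2)^T (S_1 - \bar{S}_1) \\
&= [\, X_2 - \bar{X}_2 \;\; Y_2 - \bar{Y}_2 \;\; Z_2 - \bar{Z}_2 \,]^T [\, X_1 - \bar{X}_1 \;\; Y_1 - \bar{Y}_1 \;\; Z_1 - \bar{Z}_1 \,] \\
&= [\, X_2' \;\; Y_2' \;\; Z_2' \,]^T [\, X_1' \;\; Y_1' \;\; Z_1' \,] \\
&= \begin{bmatrix}
X_2'^T X_1' & X_2'^T Y_1' & X_2'^T Z_1' \\
Y_2'^T X_1' & Y_2'^T Y_1' & Y_2'^T Z_1' \\
Z_2'^T X_1' & Z_2'^T Y_1' & Z_2'^T Z_1'
\end{bmatrix}
\end{aligned} \tag{15}
$$

where $\bar{\{\cdot\}}$ is the arithmetic mean of the values. If $M = U D V^T$ is the singular value decomposition of $M$, then $R$ and $T$ are given by:

$$
\begin{aligned}
R &= \begin{cases}
U V^T & \text{when } \det\{U V^T\} = +1 \\
U \,\mathrm{diag}\{1, 1, -1\}\, V^T & \text{when } \det\{U V^T\} = -1
\end{cases} \\
T &= \bar{S}_2 - R\,\bar{S}_1
\end{aligned} \tag{16}
$$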
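
The closed-form solution translates almost directly into NumPy. The following is a minimal sketch under the conventions above (points stored as rows, second set related to the first by a rotation followed by a translation); the function name `absolute_orientation` is hypothetical.

```python
import numpy as np

def absolute_orientation(S1, S2):
    """Least-squares R, T such that S2[i] ~= R @ S1[i] + T.

    S1, S2: L x 3 arrays of corresponding 3-D points (L >= 3, not collinear).
    """
    S1 = np.asarray(S1, dtype=float)
    S2 = np.asarray(S2, dtype=float)
    mean1, mean2 = S1.mean(axis=0), S2.mean(axis=0)
    M = (S2 - mean2).T @ (S1 - mean1)      # 3 x 3 correlation of centered points
    U, _, Vt = np.linalg.svd(M)            # M = U diag(d) V^T
    R = U @ Vt
    if np.linalg.det(R) < 0:               # reflection case: flip the last axis
        R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
    T = mean2 - R @ mean1
    return R, T
```

For noise-free input the function returns the generating rotation and translation; with noisy points it returns the least-squares estimate whose covariance is analyzed in section 1.2.1.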

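Using the Rodrigues formula, we can consequently determine from $R$ the rotation vector $\omega$:

$$
\omega = \frac{\theta}{2 \sin\theta} \,[\, r_{32} - r_{23} \;\; r_{13} - r_{31} \;\; r_{21} - r_{12} \,]^T \tag{17}
$$

where $\theta = \cos^{-1}((\mathrm{trace}(R) - 1)/2)$, and $r_{ij}$ for $i = 1, 2, 3$ and $j = 1, 2, 3$ are the elements of the rotation matrix:

$$
R = \begin{bmatrix}
r_{11} & r_{12} & r_{13} \\
r_{21} & r_{22} & r_{23} \\
r_{31} & r_{32} & r_{33}
\end{bmatrix} \tag{18}
$$

Conversely, we can determine the rotation matrix $R$ from $\omega$:

$$
R(\omega) = \cos\theta \, I + (1 - \cos\theta)\, \hat{\omega} \hat{\omega}^T + \sin\theta \, [\hat{\omega}]_\times \tag{19}
$$

where $\theta = |\omega|$ is the angle of rotation, $\hat{\omega} = \omega / |\omega|$ is the unit vector along the direction of $\omega$, and $[\omega]_\times$ is the skew-symmetric matrix

$$
[\omega]_\times = \begin{bmatrix}
0 & -\omega_3 & +\omega_2 \\
+\omega_3 & 0 & -\omega_1 \\
-\omega_2 & +\omega_1 & 0
\end{bmatrix} \tag{20}
$$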
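
The two conversions can be sketched as follows. The function names are hypothetical, and the guard for very small angles is an implementation choice that is not discussed in the text; the matrix-to-vector direction as written is only valid for rotation angles below $\pi$.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix [w]_x, so that skew(w) @ x equals np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rotvec_to_matrix(w):
    """Rotation matrix from the rotation vector w (axis times angle)."""
    w = np.asarray(w, dtype=float)
    theta = np.linalg.norm(w)
    if theta < 1e-12:                      # no rotation: avoid dividing by zero
        return np.eye(3)
    k = w / theta                          # unit rotation axis
    return (np.cos(theta) * np.eye(3)
            + (1.0 - np.cos(theta)) * np.outer(k, k)
            + np.sin(theta) * skew(k))

def matrix_to_rotvec(R):
    """Rotation vector from a rotation matrix R (valid for 0 <= theta < pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return np.zeros(3)
    off_diag = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2.0 * np.sin(theta)) * off_diag
```

Converting a rotation vector to a matrix and back should reproduce the original vector, which is a convenient consistency check on the sign conventions in (17) and (20).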


1.2.1  Accuracy  in  Motion  Estimation 

The precision in localizing selected 3-D terrain points, studied in section 1.1.1, directly affects how accurately we can compute the motion of the imaging system between two nearby positions. We can quantify the accuracy of the motion estimation from the absolute orientation solution in terms of the covariance matrices of $\omega$ and $T$. The derivation of these results uses the mixed-model least-squares adjustment approach from the photogrammetry literature [22, 23, 36].

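Let $S_1$ and $S_2$ be the two sets of measured 3-D points:

$$
S_1 = \begin{bmatrix}
X_{1,1} & Y_{1,1} & Z_{1,1} \\
X_{1,2} & Y_{1,2} & Z_{1,2} \\
\vdots & \vdots & \vdots \\
X_{1,N} & Y_{1,N} & Z_{1,N}
\end{bmatrix} = [\, X_1 \;\; Y_1 \;\; Z_1 \,], \qquad
S_2 = \begin{bmatrix}
X_{2,1} & Y_{2,1} & Z_{2,1} \\
X_{2,2} & Y_{2,2} & Z_{2,2} \\
\vdots & \vdots & \vdots \\
X_{2,N} & Y_{2,N} & Z_{2,N}
\end{bmatrix} = [\, X_2 \;\; Y_2 \;\; Z_2 \,] \tag{21}
$$

with $C_{S_1,S_2}$ representing the $6N \times 6N$ covariance matrix of the 3-D points in the two coordinate systems (see results from section 1.1.1):

$$
C_{S_1,S_2} = \begin{bmatrix} C_{S_1} & 0 \\ 0 & C_{S_2} \end{bmatrix}_{6N \times 6N} \tag{22}
$$

Recall that the measurements $\mathbf{X}_{1,i} = [X_{1,i}, Y_{1,i}, Z_{1,i}]^T$ and $\mathbf{X}_{2,i} = [X_{2,i}, Y_{2,i}, Z_{2,i}]^T$ in the two coordinate systems of the imaging system are related by

$$
R \begin{bmatrix} X_{1,i} \\ Y_{1,i} \\ Z_{1,i} \end{bmatrix} + T - \begin{bmatrix} X_{2,i} \\ Y_{2,i} \\ Z_{2,i} \end{bmatrix}
= f(\underbrace{[\, X_{1,i} \;\; Y_{1,i} \;\; Z_{1,i} \;\; X_{2,i} \;\; Y_{2,i} \;\; Z_{2,i} \,]^T}_{m}, \underbrace{[\, \omega^T \;\; T^T \,]^T}_{x}) = f(m, x) = 0 \tag{23}
$$

Determining the solution to $f(m, x) = 0$ for $x$ is in fact in the form of a Mixed-Model Least Squares Adjustment problem. To derive the covariance matrix of the estimated parameters, we define

$$
B_i = \frac{\partial f}{\partial m} = [\, R \;\; -I_{3 \times 3} \,]_{3 \times 6} \tag{24}
$$

and

$$
A_i = \left. \frac{\partial f}{\partial x} \right|_{m^{(0)}, x^{(0)}}
= [\, X_{1,i} I_{3 \times 3} \;\; Y_{1,i} I_{3 \times 3} \;\; Z_{1,i} I_{3 \times 3} \;\; I_{3 \times 3} \,]_{3 \times 12}
\begin{bmatrix} J_R & 0 \\ 0 & I_{3 \times 3} \end{bmatrix}_{12 \times 6} \tag{25}
$$

where $J_R = \partial R / \partial \omega$ is the Jacobian calculated from the Rodrigues rotation formula.

Matrices $A$ and $B$ can be defined by assembling the matrices $A_i$ and $B_i$ together:

$$
A = [\, A_1^T \;\; A_2^T \;\; \cdots \;\; A_N^T \,]^T, \qquad
B = \begin{bmatrix}
B_1 & & & \\
& B_2 & & \\
& & \ddots & \\
& & & B_N
\end{bmatrix}_{3N \times 6N} \tag{26}
$$

Finally, the covariance matrix of the estimated parameters $\omega$ and $T$ is in the form

$$
C_{\omega,T} = \left( A^T (B\, C_{S_1,S_2} B^T)^{-1} A \right)^{-1} \tag{27}
$$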

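Special Case: If we assume additive Gaussian noise with normal distribution $N(0, \sigma)$ for the 3-D points in sets $S_1$ and $S_2$, (27) simplifies to

$$
C_{\omega,T} = 2\sigma^2 (A^T A)^{-1} \tag{28}
$$

This equation shows that the covariance matrix of the final estimates depends on the configuration of the 3-D points and the noise level of their measurements, as well as the Jacobian matrix of the rotation matrix. The above equation can be simplified further under the small rotation assumption as follows:

$$
C_{\omega,T} = 2\sigma^2 \begin{bmatrix} C_{11} & C_{12} \\ C_{12}^T & C_{22} \end{bmatrix} \tag{29}
$$

where

$$
\begin{aligned}
C_{11} &= \left\{ \left( (X_1^T X_1 - N \bar{X}_1^2) + (Y_1^T Y_1 - N \bar{Y}_1^2) + (Z_1^T Z_1 - N \bar{Z}_1^2) \right) I_{3 \times 3}
- \begin{bmatrix}
X_1^T X_1 - N \bar{X}_1^2 & X_1^T Y_1 - N \bar{X}_1 \bar{Y}_1 & X_1^T Z_1 - N \bar{X}_1 \bar{Z}_1 \\
X_1^T Y_1 - N \bar{X}_1 \bar{Y}_1 & Y_1^T Y_1 - N \bar{Y}_1^2 & Y_1^T Z_1 - N \bar{Y}_1 \bar{Z}_1 \\
X_1^T Z_1 - N \bar{X}_1 \bar{Z}_1 & Y_1^T Z_1 - N \bar{Y}_1 \bar{Z}_1 & Z_1^T Z_1 - N \bar{Z}_1^2
\end{bmatrix} \right\}^{-1} \\
C_{12} &= -\frac{1}{N}\, C_{11} \begin{bmatrix}
0 & -\sum Z_{1,i} & +\sum Y_{1,i} \\
+\sum Z_{1,i} & 0 & -\sum X_{1,i} \\
-\sum Y_{1,i} & +\sum X_{1,i} & 0
\end{bmatrix} \\
C_{22} &= \frac{1}{N} \left( I_{3 \times 3} + \begin{bmatrix}
0 & -\sum Z_{1,i} & +\sum Y_{1,i} \\
+\sum Z_{1,i} & 0 & -\sum X_{1,i} \\
-\sum Y_{1,i} & +\sum X_{1,i} & 0
\end{bmatrix} C_{12} \right)
\end{aligned} \tag{30}
$$

Not surprisingly, the accuracy of the estimated motion parameters depends strongly on the distribution of the 3-D points in space.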
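
To see this dependence numerically, the special case (28) can be evaluated directly under the small rotation assumption, where each point contributes the 3 x 6 block $[-[\mathbf{X}_{1,i}]_\times \;\; I_{3 \times 3}]$ to $A$. The sketch below is illustrative only; the function name is hypothetical and i.i.d. coordinate noise of variance `sigma2` is assumed.

```python
import numpy as np

def skew(p):
    """Skew-symmetric matrix [p]_x of a 3-vector p."""
    return np.array([[0.0, -p[2], p[1]],
                     [p[2], 0.0, -p[0]],
                     [-p[1], p[0], 0.0]])

def motion_covariance_small_rotation(points, sigma2):
    """6 x 6 covariance of [omega, T] from (28), linearized about R = I.

    points: N x 3 array of 3-D points in the first coordinate system.
    sigma2: common noise variance of every point coordinate (assumed i.i.d.).
    """
    # d(R X + T)/d(omega) = -[X]_x at zero rotation, d(R X + T)/dT = I.
    blocks = [np.hstack([-skew(p), np.eye(3)]) for p in points]
    A = np.vstack(blocks)                              # 3N x 6
    return 2.0 * sigma2 * np.linalg.inv(A.T @ A)
```

Doubling the spatial spread of the points about the origin reduces the rotation block of the returned covariance by a factor of four, which is one concrete way to read the remark about point distribution.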


1.3  Positioning 

If the images from multiple nearby positions are to be fused to construct a larger composite view, we need to know these positions of the imaging system relative to some reference frame. Without loss of generality, let us assume that the reference frame is the very first position of the imaging system. The positions $P_i$ and $P_{i+1}$ of the vision system at time instants $i$ and $i+1$, expressed in a global coordinate system (usually chosen as the initial trajectory position), are related by

$$
P_{i+1} = R_i P_i + T_i \tag{31}
$$

1.3.1  Accuracy  in  3-D  Positioning 

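We are interested in finding the covariance matrix of $P_{i+1}$, when the covariances of the motion parameters and $P_i$ are known. Let $V_i$ be the variables of this transformation, i.e., $V_i = [\, \omega_i^T \;\; T_i^T \;\; P_i^T \,]^T$. The Jacobian matrix of this transformation is in the form

$$
J_{P_{i+1}} = \frac{\partial P_{i+1}}{\partial V_i}
= [\, X_i I_{3 \times 3} \;\; Y_i I_{3 \times 3} \;\; Z_i I_{3 \times 3} \;\; I_{3 \times 3} \;\; R_i \,]_{3 \times 15}
\begin{bmatrix} J_{R_i} & 0 \\ 0 & I_{6 \times 6} \end{bmatrix}_{15 \times 9} \tag{32}
$$

giving the covariance matrix of the estimated position:

$$
C_{P_{i+1}} = J_{P_{i+1}} \begin{bmatrix} C_{\omega_i, T_i} & 0 \\ 0 & C_{P_i} \end{bmatrix} J_{P_{i+1}}^T \tag{33}
$$

with $C_{P_0} = 0_{3 \times 3}$.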
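
One step of this propagation might be sketched as below. Two points are assumptions of the sketch rather than of the text: the derivative with respect to the rotation vector is taken by central differences instead of an analytical Jacobian of the Rodrigues formula, and the function names are hypothetical.

```python
import numpy as np

def rotvec_to_matrix(w):
    """Rodrigues formula: rotation matrix from a rotation vector w."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.cos(theta) * np.eye(3) + (1 - np.cos(theta)) * np.outer(k, k) + np.sin(theta) * K

def propagate_position(P, C_P, w, T, C_wT, eps=1e-6):
    """One step of P_next = R(w) P + T with first-order covariance propagation.

    C_P:  3 x 3 covariance of the current position P.
    C_wT: 6 x 6 covariance of the motion parameters [omega, T].
    """
    R = rotvec_to_matrix(w)
    P_next = R @ P + T
    J = np.zeros((3, 9))                  # derivative w.r.t. [omega, T, P]
    for k in range(3):                    # numerical derivative w.r.t. omega_k
        d = np.zeros(3)
        d[k] = eps
        J[:, k] = (rotvec_to_matrix(w + d) @ P - rotvec_to_matrix(w - d) @ P) / (2 * eps)
    J[:, 3:6] = np.eye(3)                 # derivative w.r.t. T
    J[:, 6:9] = R                         # derivative w.r.t. P
    C_in = np.zeros((9, 9))
    C_in[:6, :6] = C_wT
    C_in[6:, 6:] = C_P
    return P_next, J @ C_in @ J.T
```

Starting from a zero position covariance and calling `propagate_position` once per trajectory step shows how the positional uncertainty accumulates along the flight path.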
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Appendix  2:  3-D  Reconstruction  from  Multiple  2-D 

Views 


The estimation of motion from image sequences is one of the most important and highly investigated problems in the vision literature [fn. 8]. While it is impossible to study many methods in the limited time of this effort, given that we also have to explore the impact of a number of parameters, three relatively different approaches have been tested, each allowing us to incorporate GPS readings in the motion estimation process in one way or another.

The oldest solution of the motion problem in the vision literature may be the classical eight-point algorithm [24], which has been studied extensively [16, 17]. Here, correspondences of a minimum of 8 points in two images are used to determine both the motion of the camera and the 3-D positions of these points in space. While the original formulation has led to other solutions with fewer points, say a nonlinear seven-point solution [2, 34], we have adopted this method since it provides a closed-form solution. It is described here for completeness, so that we can show how we incorporate knowledge of GPS measurements, rather than determining the translation from the solution [fn. 9]. In the second approach, we incorporate the GPS readings directly into the motion model, and consequently estimate the sought-after rotational motion from a non-linear estimation process. Finally, we apply a closed-form solution based on the small rotation approximation, utilizing a constraint similar to the differential optical flow model. To the best of our knowledge, a similar solution has not been proposed before.

2.1 Closed-Form Eight-Point Solution

Consider the projection $p_k = (x_k, y_k)^T$ of a scene point $P = (X, Y, Z)^T$:

$$
\tilde{p}_k \cong C_k \tilde{P},
$$

where $\cong$ denotes up-to-scale equality, $\tilde{p}_k = (p_k^T, 1)^T$ and $\tilde{P} = (P^T, 1)^T$ are the homogeneous coordinates of the image point $p_k$ and scene point $P$, respectively ($p_k$ is determined from $\tilde{p}_k$ by dividing by the 3rd component), and $C_k$ is known as the camera matrix:

$$
C_k = M_{int} K [R_{ok} | t_{ok}].
$$

Here, $M_{int}$ is the $3 \times 3$ matrix of camera intrinsic parameters [fn. 10], $K = [I_{3 \times 3} | 0_{3 \times 1}]$, and $\{R_{ok} | t_{ok}\}$ describe the pose and position of the camera at position $k$, relative to the reference coordinate system.

[fn. 8] Hartley and Zisserman [17] is probably the best reference for many of the most recent feature-based methods.

[fn. 9] Our simulations show that we would obtain more accurate results, given a relatively large distance (on the order of 1/5-1/10 of the flight altitude) between two aircraft positions, known with variances of about 1-2 m.

[fn. 10] We assume a calibrated imaging system, and thus this is known.

It is well known that $\tilde{p}_i$ and $\tilde{p}_j$ satisfy the epipolar constraint

$$
\tilde{p}_j^T E_{ij} \tilde{p}_i = 0,
$$

where $E_{ij} = [t_{ij}]_\times R_{ij}$ is commonly known as the essential matrix, and the $3 \times 3$ matrix $[t]_\times$ represents the 3-D vector $t$ as a skew-symmetric matrix, such that $[t]_\times x = t \times x$ for any 3-D vector $x$. With 8 or more point correspondences $\{p_i^k, p_j^k\}$ ($k = 1 : N$; $N \ge 8$), we can write enough linear homogeneous equations in the form of the above constraint, which can subsequently be solved for the 9 unknowns in $E_{ij}$, up to a scale factor ambiguity [fn. 11]. While the next stage involves determining $R_{ij}$ and $t_{ij}$ from $E_{ij}$, we show one way of incorporating the knowledge of $t_{ij}$ to compute $R_{ij}$.

To do this, we first normalize $E_{ij}$ according to $E_{ij} = \sqrt{2 / \mathrm{trace}(E_{ij}^T E_{ij})}\, E_{ij}$ (we have used the same notation for $E_{ij}$ before and after scaling). With $e_k$ ($k = 1 : 3$) denoting the rows of $E_{ij}$, and $w_k = e_k \times t_{ij}$, the rows $r_i$ ($i = 1 : 3$) of the rotation matrix are given by $r_i = -(w_i + w_j \times w_k)$, where $\{i, j, k\}$ is an even permutation of $\{1, 2, 3\}$. Finally, given that $R_{ij}$ may not be a rotation matrix due to noise in the feature correspondence positions and the translation vector, the best approximate rotation matrix is found as $R_{ij} = U V^T$, where $U$ and $V$ are the left and right matrices in the singular value decomposition $R_{ij} = U S V^T$ (again, the same notation is used before and after "singular value normalization").
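
The linear stage and the two normalizations can be sketched as follows. This is a generic illustration in normalized (calibrated) image coordinates, not the report's code: the function names are hypothetical, and the step that recovers the rotation rows from the normalized essential matrix and the known translation is deliberately left out.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]_x, so that skew(t) @ x equals np.cross(t, x)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def estimate_essential(p1, p2):
    """Linear estimate of E from N >= 8 correspondences with p2_h^T E p1_h = 0.

    p1, p2: N x 2 arrays of image points in normalized coordinates.
    Returns E scaled so that trace(E^T E) = 2 (its overall sign is arbitrary).
    """
    p1h = np.hstack([p1, np.ones((len(p1), 1))])
    p2h = np.hstack([p2, np.ones((len(p2), 1))])
    # Each row holds the coefficients of the 9 entries of E in row-major order.
    Q = np.stack([np.outer(b, a).ravel() for a, b in zip(p1h, p2h)])
    _, _, Vt = np.linalg.svd(Q)
    E = Vt[-1].reshape(3, 3)               # null vector of Q, up to scale
    return E * np.sqrt(2.0 / np.trace(E.T @ E))

def nearest_rotation(R_approx):
    """Closest orthonormal matrix to R_approx: U V^T from its SVD."""
    U, _, Vt = np.linalg.svd(R_approx)
    return U @ Vt
```

For a unit translation vector the true essential matrix already satisfies trace($E^T E$) = 2, so with noise-free correspondences `estimate_essential` recovers it up to sign.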

2.2  Non-Linear  Solution 

Here, we start with the projection equation:

$$
\tilde{p}_k \cong C_k \tilde{P}, \qquad C_k = M_{int} K [R_{ok} | t_{ok}].
$$

Each image point $p_k = (x_k, y_k)^T$ gives two constraints in terms of the 3 independent unknowns of the rotational motion $R_{ok}$ [fn. 12], and 3 unknowns for each terrain point $\tilde{P} = (X, Y, Z, 1)^T$. Suppose we have $k = 1 : M$ views of $l = 1 : N$ points. With 3 unknowns for each view's pose angles (rotation matrix) and 3 for each 3-D terrain point, we have $3 * M + 3 * N$ unknowns with $2 * M * N$ equations. If we take the camera coordinate system at some view $m$ to be the reference frame, we have $R_m = I_{3 \times 3}$ and are thus down to $3 * M + 3 * N - 3$ unknowns. With $M = 2$ views, we need a minimum of $N = 3$ points to have sufficient equations to solve for the 12 unknowns from 12 nonlinear equations, each in the form of the above constraint. A common approach is to apply the Levenberg-Marquardt optimization technique, providing the analytical Jacobian of the constraint equation to speed up the convergence [26].

[fn. 11] This ambiguity will be resolved with knowledge of $t_{ij}$, as shown later.

[fn. 12] We can express $R_{ok}$ in terms of its 3 degrees of freedom in many ways, including the Rodrigues formula involving the axis and angle of rotation.

2.3 Closed-Form Solution with Small-Angle Rotations

Again, the projection $p_k = (x_k, y_k)^T$ of a scene point $P = (X, Y, Z)^T$ gives

$$
\tilde{p}_k \cong M_{int} K [R_{ok} | t_{ok}] \tilde{P}.
$$

The first view may be assumed to be at the reference position: $R_{o1} = I_{3 \times 3}$, and $t_{o1} = 0$. Thus, we can write $\tilde{p}_1 \cong M_{int} P$, which gives two constraints on $P = (X, Y, Z)^T$. If $m_i$ ($i = 1 : 3$) denotes the rows of $M_{int}$, and defining 3-D vectors $\alpha_1 = x_1 m_3 - m_1$ (with components $\alpha_{1i}$; $i = 1 : 3$) and $\beta_1 = y_1 m_3 - m_2$ (with components $\beta_{1i}$; $i = 1 : 3$), it can be readily shown that

$$
X = k_x Z \quad \text{and} \quad Y = k_y Z,
$$

where

$$
\begin{aligned}
k_x &= (\alpha_{12} \beta_{13} - \alpha_{13} \beta_{12}) / (\alpha_{11} \beta_{12} - \alpha_{12} \beta_{11}) \\
k_y &= (\alpha_{13} \beta_{11} - \alpha_{11} \beta_{13}) / (\alpha_{11} \beta_{12} - \alpha_{12} \beta_{11}).
\end{aligned}
$$

For the 2nd view, we have

$$
\tilde{p}_2 \cong M_{int} (R_{o2} P + t_{o2}).
$$

If rotation is assumed small [fn. 13], we can use the approximation $R_{o2} = I + [\omega]_\times$, where $[\omega]_\times$ is a skew-symmetric matrix corresponding to the rotation vector $\omega$ ($[\omega]_\times x = \omega \times x$). By substitution and some tedious algebraic manipulation, we finally arrive at two constraint equations:

$$
\begin{aligned}
\left[ (\alpha_{23} k_y - \alpha_{22}) \omega_x + (\alpha_{21} - \alpha_{23} k_x) \omega_y + (\alpha_{22} k_x - \alpha_{21} k_y) \omega_z + (\alpha_{21} k_x + \alpha_{22} k_y + \alpha_{23}) \right] Z + g_a &= 0, \\
\left[ (\beta_{23} k_y - \beta_{22}) \omega_x + (\beta_{21} - \beta_{23} k_x) \omega_y + (\beta_{22} k_x - \beta_{21} k_y) \omega_z + (\beta_{21} k_x + \beta_{22} k_y + \beta_{23}) \right] Z + g_b &= 0,
\end{aligned}
$$

where

$$
g_a = \alpha_2 \cdot t_{o2}, \qquad g_b = \beta_2 \cdot t_{o2},
$$

$\alpha_2 = x_2 m_3 - m_1$ (with components $\alpha_{2i}$; $i = 1 : 3$), $\beta_2 = y_2 m_3 - m_2$ (with components $\beta_{2i}$; $i = 1 : 3$), and $\omega = (\omega_x, \omega_y, \omega_z)^T$ and $t_{o2} = (t_x, t_y, t_z)^T$ are described in terms of their motion components. These equations resemble the differential image motion constraint [19, 24], written in terms of point correspondences $p_1 = (x_1, y_1)$ and $p_2 = (x_2, y_2)$. Eliminating $Z$ leads to a constraint equation in terms of the rotation $\omega$:

$$
\begin{aligned}
&\left( (\beta_{23} k_y - \beta_{22}) g_a - (\alpha_{23} k_y - \alpha_{22}) g_b \right) \omega_x \; + \\
&\left( (\beta_{21} - \beta_{23} k_x) g_a - (\alpha_{21} - \alpha_{23} k_x) g_b \right) \omega_y \; + \\
&\left( (\beta_{22} k_x - \beta_{21} k_y) g_a - (\alpha_{22} k_x - \alpha_{21} k_y) g_b \right) \omega_z \; + \\
&(\beta_{21} k_x + \beta_{22} k_y + \beta_{23}) g_a - (\alpha_{21} k_x + \alpha_{22} k_y + \alpha_{23}) g_b = 0.
\end{aligned}
$$

[fn. 13] In practice, the raw image can be de-rotated based on some a priori rough knowledge of pitch, roll and yaw angles, e.g., either from gyros or an earlier estimate of the angles at the previous UAV position, leaving us with small correction angles to be estimated.

With GPS readings, which give us $t_{o2}$, this becomes a linear constraint in terms of the rotational motion components. A minimum of 3 point correspondences enables us to compute a solution in closed form. A least-squares solution can be determined from a redundant set of equations with $N > 3$ points.
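
A compact way to experiment with this idea is sketched below. It does not use the $\alpha$, $\beta$, $k_x$, $k_y$ notation of the text; instead it assumes normalized image coordinates (identity intrinsics) and writes the same one-equation-per-point constraint as the epipolar constraint with the rotation linearized, which is what eliminating the depth amounts to in that setting. The function names are hypothetical.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ x equals np.cross(a, x)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def small_rotation_from_known_translation(p1, p2, t):
    """Least-squares omega from N >= 3 correspondences, with R ~ I + [omega]_x.

    p1, p2: N x 2 normalized image points in views 1 and 2.
    t:      known translation of view 2 (e.g. from GPS), with P2 = R @ P1 + t.
    """
    p1h = np.hstack([p1, np.ones((len(p1), 1))])
    p2h = np.hstack([p2, np.ones((len(p2), 1))])
    tx = skew(t)
    # p2^T [t]_x (I + [omega]_x) p1 = 0
    #   ->  (p2^T [t]_x [p1]_x) omega = p2^T [t]_x p1   (one row per point)
    rows = np.stack([b @ tx @ skew(a) for a, b in zip(p1h, p2h)])
    rhs = np.array([b @ tx @ a for a, b in zip(p1h, p2h)])
    omega, *_ = np.linalg.lstsq(rows, rhs, rcond=None)
    return omega
```

When the second view is generated with the linearized motion model the rotation vector is recovered to rounding error; with a true rotation matrix the estimate carries a second-order error in the rotation angle, which is why de-rotating the image first keeps the approximation tight.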


2.4  Construction  of  Terrain  3-D  Points 

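Once we know the motion, we can write the constraint $\tilde{p}_k \cong C_k \tilde{P}$ in the form

$$
\begin{aligned}
(x_k c_k^3 - c_k^1) \cdot \tilde{P} &= 0 \\
(y_k c_k^3 - c_k^2) \cdot \tilde{P} &= 0
\end{aligned}
$$

where $c_k^i$ ($i = 1 : 3$) denote the rows of the camera matrix $C_k$. With $M \ge 2$ views, we have a redundant set of equations to solve for $\tilde{P}$:

$$
A \tilde{P} = 0; \qquad A = \begin{bmatrix}
x_1 c_1^3 - c_1^1 \\
y_1 c_1^3 - c_1^2 \\
x_2 c_2^3 - c_2^1 \\
y_2 c_2^3 - c_2^2 \\
\vdots \\
x_M c_M^3 - c_M^1 \\
y_M c_M^3 - c_M^2
\end{bmatrix}_{2M \times 4}
$$

The up-to-scale solution can be found from the eigenvector corresponding to the smallest eigenvalue of the $4 \times 4$ matrix $(A^T A)$. Scale ambiguity is resolved by finding $P$ from the first 3 elements of $\tilde{P} / \tilde{P}_4$.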


2.5  Estimation  of  Motion  from  Known  Target  Points 

Assume we know the positions $P^i$ ($i = 1 : N$) of some targets on the ground [fn. 14]. We want to determine the pose (angular motions relative to the reference frame) of the aircraft, while also utilizing the GPS readings. Again, we start from the projection equation:

$$
\tilde{p}_k^i \cong M_{int} K [R_{ok} | t_{ok}] \tilde{P}^i,
$$

which gives two constraints in terms of only 3 independent unknowns in the rotational motion $R_{ok}$. With $M$ views of $N$ points, we have $3 * M - 3$ unknowns in $2 * M * N$ equations. Even with 3 points, we have more equations than unknowns ($6M > 3M - 3$). While the equations are nonlinear, we can again apply the Levenberg-Marquardt optimization technique, supplying the analytical Jacobian for faster convergence.
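
For a single view, the estimation might look like the sketch below. Several things here are assumptions made only for illustration: normalized image coordinates, a hand-written damped Gauss-Newton loop standing in for a full Levenberg-Marquardt implementation, and a forward-difference Jacobian in place of the analytical one mentioned in the text. All function names are hypothetical.

```python
import numpy as np

def rotvec_to_matrix(w):
    """Rodrigues formula: rotation matrix from a rotation vector w."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.cos(theta) * np.eye(3) + (1 - np.cos(theta)) * np.outer(k, k) + np.sin(theta) * K

def reprojection_residuals(w, points, pixels, t):
    """Stacked (x, y) reprojection errors for rotation vector w and known t."""
    Q = points @ rotvec_to_matrix(w).T + t        # targets in the camera frame
    return (Q[:, :2] / Q[:, 2:3] - pixels).ravel()

def estimate_rotation_lm(points, pixels, t, w0=None, iters=50, h=1e-7):
    """Estimate the rotation vector from known 3-D targets and known translation."""
    w = np.zeros(3) if w0 is None else np.array(w0, dtype=float)
    lam = 1e-3                                    # damping factor
    r = reprojection_residuals(w, points, pixels, t)
    for _ in range(iters):
        J = np.zeros((r.size, 3))
        for k in range(3):                        # forward-difference Jacobian
            d = np.zeros(3)
            d[k] = h
            J[:, k] = (reprojection_residuals(w + d, points, pixels, t) - r) / h
        H = J.T @ J
        step = np.linalg.solve(H + lam * np.diag(np.diag(H)), -J.T @ r)
        if np.linalg.norm(step) < 1e-12:          # converged
            break
        r_new = reprojection_residuals(w + step, points, pixels, t)
        if r_new @ r_new < r @ r:                 # accept step, relax damping
            w, r, lam = w + step, r_new, lam / 10.0
        else:                                     # reject step, increase damping
            lam *= 10.0
    return w
```

With the translation fixed from GPS, only three parameters are estimated per view, so even a handful of known targets gives a well over-determined problem.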


[fn. 14] These can be determined from the earlier solutions.

Figure 4: An experiment at 500 [m], showing XY projections of 30 ground features (red circles) imaged from 5, 7, ..., 19 UAV positions (black x). The 25% of features with the highest error variance (blue square) and those 3 times above the median (red dot) are identified; they are typically located near the boundaries, either far away from or below a cluster of UAV viewing positions.
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Appendix  3a:  Simulations  of  panoramic  high-resolution  imaging  system 


Results  in  this  appendix  comprise  simulations  to  test  the  performance  of  the  panoramic 
imaging  system,  by  varying  the  values  of  various  design  parameters.  These  tests  deal  with 
both  the  estimation  of  the  3-D  coordinates  of  terrain  features,  and  3-D  motion  utilizing 
3-D  feature  coordinates  calculated  at  each  UAV  position  along  its  trajectory. 
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Figure  4:  Uncertainty  in  the  estimated  x  component  of  the  terrain  feature  position  at  3  UAV 
altitudes  of  100  [m],  500  [m]  and  2  [km]  for  various  system  parameters,  including  number  of 
cameras,  camera  resolution,  and  imaging  system  radius  that  controls  the  camera  baselines. 
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Figure  5:  Uncertainty  in  the  estimated  Z  component  of  the  terrain  feature  position  at  3  UAV 
altitudes  of  100  [m],  500  [m]  and  2  [km]  for  various  system  parameters,  including  number  of 
cameras,  camera  resolution,  and  imaging  system  radius  that  controls  the  camera  baselines. 
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Figure  8:  Uncertainties  in  estimating  the  3-D  translation  components,  assuming  knowledge  of 
rotational  motion  with  an  uncertainty  of  0.1  [deg],  while  varying  the  number  and  resolution  of 
cameras. 


Figure 9: Uncertainties of 3-D motion parameters, rotation $[\omega_x, \omega_y, \omega_z]$ and translation $[t_x, t_y, t_z]$, with varying radius of imaging system structure with 12 cameras for various camera resolutions.
Figure 10: Uncertainties of 3-D motion parameters, rotation $[\omega_x, \omega_y, \omega_z]$ and translation $[t_x, t_y, t_z]$, with varying density of terrain feature points used in the computations for various camera resolutions.
Figure 11: Uncertainties in the estimation of 3-D motion parameters, rotation $[\omega_x, \omega_y, \omega_z]$ and translation $[t_x, t_y, t_z]$, with varying viewing angle of 12 cameras at various resolutions.
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Figure  12:  Assuming  knowledge  of  rotation  with  an  uncertainty  of  0.1  [deg],  3-D  translation 
uncertainties are determined while varying the viewing angle of 12 cameras at various resolutions.

Appendix 3b: Results of closed-form eight-point algorithm for terrain feature localization


Results in this appendix are for the three altitudes of 500 [m], 2000 [m] and 4000 [m]. Each page comprises 5 x 4 arrays of plots arranged in 3 rows and 2 columns. Each row corresponds to one of the three L x L (L = {1, 3, 4}) image resolutions. Each of the two columns deals with one of the four values of N (N = {15, 30, 45, 60}), the number of terrain features used in the computation of the UAV pose from two views. The complete set for all four choices of N is given on two subsequent pages. Each of the 5 x 4 arrays corresponds to computations based on tracking the features in M (M = 2 : 21) views. Finally, various pages contain the results for GPS variances $\sigma_{GPS} = \{0, 1, 2\}$ [m].
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Figure  13:  Closed-Form  8-Point  Algorithm-  Uncertainty  (reconstruction  variance  [m])  in 
terrain  feature  localization  by  tracking  15  (left)  and  30  (right)  points  with  noise-free  GPS 
(Altitude  500);  See  text  for  details. 
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Figure  13:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 
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Figure  13:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance 
of  1  [m]. 
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Figure  14:  Closed-Form  8-Point  Algorithm-  Uncertainty  (reconstruction  variance  [m])  in 
terrain  feature  localization  by  tracking  15  (left)  and  30  (right)  points  with  noise-free  GPS 
(altitude  of  2000  [m]). 
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Figure  14:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance 
of  1  [m] . 
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Figure  15:  Closed-Form  8-Point  Algorithm-  Uncertainty  (reconstruction  variance  [m])  in 
terrain  feature  localization  by  tracking  15  (left)  and  30  (right)  points  with  noise-free  GPS 
(altitude of 4000 [m]).
Figure 15: (continued)- Tracking 45 (left) and 60 (right) points with noise-free GPS.
Figure 15: (continued)- Tracking 45 (left) and 60 (right) points with GPS error variance
of 1 [m].

Appendix 3c: Results of closed-form solution with small-angle rotations for terrain feature localization


Results in this appendix are from simulations for the three altitudes of 500 [m], 2000 [m] and 4000 [m]. Each page consists of 5 x 4 arrays of plots arranged in 3 rows and 2 columns. Each row corresponds to one of the three L x L (L = {1, 3, 4}) image resolutions. Each of the two columns deals with one of the four values of N (N = {15, 30, 45, 60}), the number of terrain features used in the computation of the UAV pose from two views. The complete set for all four choices of N is given on two subsequent pages. Each of the 5 x 4 arrays corresponds to computations based on tracking the features in M (M = 2 : 21) views. Finally, various pages contain the results for GPS variances $\sigma_{GPS} = \{0, 1, 2\}$ [m].

[Plot panels: Var of XYZ vs. Point No.; Alt: 500 [m], CamRes = 1 [KPix], Npnts: 15, PixNoise: 1 [pix], GPSErr: 0 [m]]

Figure 16: Closed-Form Solution with Small Rotation Approximation- Uncertainty (reconstruction variance [m]) in terrain feature localization by tracking 15 (left) and 30 (right)
points with noise-free GPS (Altitude 500); See text for details.

[Plot panels: Var of XYZ vs. Point No.; Alt: 500 [m], CamRes = 1, 3, 4 [KPix], Npnts: 15 and 30, PixNoise: 1 [pix], GPSErr: 1 [m]]

Figure 16: (continued)- Tracking 15 (left) and 30 (right) points with GPS error variance
of 1 [m].

[Plot panels: Var of XYZ vs. Point No.; Alt: 500 [m], CamRes = 1, 3, 4 [KPix], Npnts: 60, PixNoise: 1 [pix], GPSErr: 1 [m]]

Figure 16: (continued)- Tracking 45 (left) and 60 (right) points with GPS error variance
of 1 [m].

[Plot panels: Var of XYZ vs. Point No.; Alt: 500 [m], CamRes = 1, 3, 4 [KPix], Npnts: 45 and 60, PixNoise: 1 [pix], GPSErr: 2 [m]]

Figure 16: (continued)- Tracking 45 (left) and 60 (right) points with GPS error variance
of 2 [m].

[Plot panels: Var of XYZ vs. Point No.; Alt: 2000 [m], CamRes = 1, 3, 4 [KPix], Npnts: 15 and 30, PixNoise: 1/3 [pix], GPSErr: 0 [m]]

Figure 17: Closed-Form Solution with Small Rotation Approximation- Uncertainty (reconstruction variance [m]) in terrain feature localization by tracking 15 (left) and 30 (right) points with noise-free GPS (Altitude 2000).

[Plot panels: Var of XYZ vs. Point No.; Alt: 2000 [m], Npnts: 45 and 60, PixNoise: 1/3 [pix], GPSErr: 0 [m]; only some panel titles are legible]

Figure  17:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 
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Figure  17:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  GPS  error  variance 
of  1  [m]. 
[Plots omitted. Panel titles: Var of XYZ; Alt: 4000 [m]; CamRes = 1, 3 and 4 [KPix]; Npnts: 15 and 30; PixNoise: 1/3 [pix]; GPSErr: 0 [m]. Horizontal axis: Point No. (0 to 15).]

Figure 18: Closed-Form Solution with Small Rotation Approximation- Uncertainty (reconstruction variance [m]) in terrain feature localization by tracking 15 (left) and 30 (right) points with noise-free GPS (Altitude 4000).
[Plots omitted. Panel titles: Var of XYZ; Alt: 4000 [m]; CamRes = 1, 3 and 4 [KPix]; Npnts: 45 and 60; PixNoise: 1/3 [pix]; GPSErr: 0 [m]. Horizontal axis: Point No.]

Figure  18:  (continued)-  Tracking  45  (left)  and  60  (right)  points  with  noise-free  GPS. 
[Plots omitted. Surviving panel titles: Var of XYZ; Alt: 4000 [m]; CamRes = 1, 3 and 4 [KPix] with Npnts: 45, and CamRes = 1 [KPix] with Npnts: 60; PixNoise: 1/3 [pix]; GPSErr: 1 [m]. Horizontal axis: Point No.]

Figure 18: (continued)- Tracking 45 (left) and 60 (right) points with GPS error variance of 1 [m].
[Plots omitted. Surviving panel titles: Var of XYZ; Alt: 4000 [m]; CamRes = 1 [KPix]; Npnts: 15 and 30; PixNoise: 1/3 [pix]; GPSErr: 2 [m].]

Figure 18: (continued)- Tracking 15 (left) and 30 (right) points with GPS error variance
Appendix 3d: Selected results from nonlinear iterative solution for terrain feature localization


Each page in this section consists of six 3x3 arrays of plots, arranged in 3 rows and 2 columns. Each row of 2 such arrays corresponds to one of three L x L (L = {1,3,4}) image resolutions. Each column of 3 such arrays comprises the results for one case of N (N = {3,6,9,12}) terrain features being tracked (in M = {2 : 15} frames) to determine the 3-D coordinates of landmark terrain features. The complete set for all four values of N is given on two subsequent pages. Each 3x3 array corresponds to the error variances of the X, Y and Z coordinates for a sample of 3 out of N terrain features (the complete set is given in Appendix 3e). Each plot has three curves (R, G, and B colors) for GPS variances σ_GPS = {0,1,2} [m], respectively. Finally, various pages contain the results for 3 altitudes of 500 [m], 2000 [m], and 4000 [m].
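The page layout described above can be summarized with a short Python sketch. It is only an illustration of the indexing: the function name `array_position` and the assumption that pages are ordered by altitude (two pages per altitude) are not taken from the report.

```python
from itertools import product

ALTITUDES_M = (500, 2000, 4000)   # assumed page order: two pages per altitude
RESOLUTIONS_K = (1, 3, 4)         # L: one row of arrays per image resolution
N_FEATURES = (3, 6, 9, 12)        # N: two columns of arrays per page
GPS_SIGMAS_M = (0, 1, 2)          # one curve (R, G, B) per GPS variance
FRAMES = range(2, 16)             # M = 2..15, the horizontal axis of each plot


def array_position(altitude_m, n_features, resolution_k):
    """Return zero-based (page, row, column) of one 3x3 array of plots."""
    alt_idx = ALTITUDES_M.index(altitude_m)
    n_idx = N_FEATURES.index(n_features)
    page = 2 * alt_idx + n_idx // 2          # N = 3, 6 on first page; 9, 12 on second
    row = RESOLUTIONS_K.index(resolution_k)  # one row per resolution
    column = n_idx % 2                       # left or right array on the page
    return page, row, column


# Every (altitude, N, resolution) case maps to exactly one array position.
layout = {
    case: array_position(*case)
    for case in product(ALTITUDES_M, N_FEATURES, RESOLUTIONS_K)
}
n_pages = len({position[0] for position in layout.values()})
# Each array holds 3x3 plots (X, Y, Z for 3 sample points), each with 3 curves.
n_curves = len(layout) * 9 * len(GPS_SIGMAS_M)
```

Under these assumptions the 36 cases fill 6 pages of six arrays each, and every curve is sampled at the 14 frame counts in `FRAMES`.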
Figure 19: Nonlinear Iterative Solution- Uncertainty (reconstruction variance [m]) of 3 sample points in terrain feature localization by tracking 3 (left) and 6 (right) points in images with L x L (L = {1,3,4}) resolutions (Altitude 500); See text for details.
Figure 19: (continued)- Tracking 9 (left) and 12 (right) points.
Figure 20: Nonlinear Iterative Solution- Uncertainty (reconstruction variance [m]) of 3 sample points in terrain feature localization by tracking 3 (left) and 6 (right) points in images with L x L (L = {1,3,4}) resolutions (Altitude 2000).
[Plots omitted. Surviving panel title: GPS sigX,Y,Z = 0 (R); sigX,Y = 1 [m], sigZ = 10 [m] (G); sigX,Y = 2 [m], sigZ = 20 [m] (B); Alt: 2000 [m]; Res: 1 [K]; PixNoise: 1/3.]

Figure  20:  (continued)-  Tracking  9  (left)  and  12  (right)  points. 
Figure 21: Nonlinear Iterative Solution- Uncertainty (reconstruction variance [m]) of 3 sample points in terrain feature localization by tracking 3 (left) and 6 (right) points in images with L x L (L = {1,3,4}) resolutions (Altitude 4000).
Figure 21: (continued)- Tracking 9 (left) and 12 (right) points.
Appendix 3e: Complete results from nonlinear iterative solution for terrain feature localization


These figures correspond to the same simulations as in the previous appendix; however, they comprise results for all of the terrain feature points.

Each page in this section consists of six 3x3 arrays of plots, arranged in 3 rows and 2 columns. Each row of 2 such arrays corresponds to one of three L x L (L = {1,3,4}) image resolutions. Each column of 3 such arrays comprises the results for one case of N (N = {3,6,9,12}) terrain features being tracked (in M = {2 : 15} frames) to determine the 3-D coordinates of landmark terrain features. The complete set for all four values of N is given on two subsequent pages. Each of the 3x3 arrays corresponds to the error variances of the X, Y and Z coordinates of all N terrain features. Each plot has three curves (R, G, and B colors) for GPS variances σ_GPS = {0,1,2} [m], respectively. Finally, various pages contain the results for 3 altitudes of 500 [m], 2000 [m], and 4000 [m].




[Plots omitted. Surviving panel titles: GPS sigX,Y,Z = 0 (R); sigX,Y = 1 [m], sigZ = 10 [m] (G); sigX,Y = 2 [m], sigZ = 20 [m] (B); Alt: 500 [m]; Res: 1 and 3 [K] (third title illegible). Horizontal axis: No. frames.]

Figure 22: Nonlinear Iterative Solution- Uncertainty (reconstruction variance [m]) in terrain feature localization by tracking 3 (left) and 6 (right) points in images with L x L (L = {1,3,4}) resolutions (Altitude 500); See text for details.
Figure 23: Nonlinear Iterative Solution- Uncertainty (reconstruction variance [m]) in terrain feature localization by tracking 3 (left) and 6 (right) points in images with L x L (L = {1,3,4}) resolutions (Altitude 2000); See text for details.
[Plots omitted. Surviving panel title: GPS sigX,Y,Z = 0 (R); sigX,Y = 1 [m], sigZ = 10 [m] (G); sigX,Y = 2 [m], sigZ = 20 [m] (B); Alt: 4000 [m]; Res: 1 [K]; PixNoise: 1/3.]
Figure 24: Nonlinear Iterative Solution- Uncertainty (reconstruction variance [m]) in terrain feature localization by tracking 3 (left) and 6 (right) points in images with L x L (L = {1,3,4}) resolutions (Altitude 4000); See text for details.
Figure 24: (continued)- Tracking 9 (left) and 12 (right) points.


Appendix  3f:  Results  of  nonlinear  iterative  solution  for  UAV  pose  estimation 

This section consists of three 2x3 arrays of plots. Defining the pose as a rotation of the coordinate system with respect to the reference frame, the rows correspond to the variances of the x-y (pitch and roll) and z (heading) components of the rotation vector, as determined from the Rodrigues formula. The three columns are for the 3 camera resolutions. The three curves in each plot are for GPS variances σ_GPS = {0,1,2} [m]. Finally, each array is the result for one of the 3 altitudes: 500 [m], 2000 [m], and 4000 [m].
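As a minimal sketch of the rotation-vector representation referred to above, the following Python code converts between a rotation vector and a rotation matrix using the Rodrigues formula. The function names are illustrative and are not taken from the report; the inverse map assumes rotation angles well below pi, which is reasonable for the small pose angles considered here.

```python
import math


def rotation_vector_to_matrix(r):
    """Rodrigues formula: R = I + sin(t) K + (1 - cos(t)) K^2, with t = |r|."""
    theta = math.sqrt(sum(c * c for c in r))
    if theta < 1e-12:
        # Zero rotation: return the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in r)
    # Skew-symmetric cross-product matrix of the unit rotation axis.
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = [[sum(K[i][m] * K[m][j] for m in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + c * K2[i][j]
             for j in range(3)] for i in range(3)]


def matrix_to_rotation_vector(R):
    """Inverse map (assumes the rotation angle is well below pi)."""
    cos_theta = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    # The antisymmetric part of R equals 2 sin(theta) times the rotation axis.
    v = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
    if theta < 1e-12:
        return [0.5 * x for x in v]
    scale = theta / (2.0 * math.sin(theta))
    return [scale * x for x in v]


# Example: a small pitch/roll (x, y) and a larger heading (z), in radians.
pose = matrix_to_rotation_vector(rotation_vector_to_matrix([0.01, -0.02, 0.3]))
pitch_roll_deg = [math.degrees(a) for a in pose[:2]]
heading_deg = math.degrees(pose[2])
```

In a simulation of the kind described here, variances such as those plotted would presumably be obtained by repeating the pose estimate over many noisy trials and taking the sample variance of each rotation-vector component; that step is not shown.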
[Plots omitted. Surviving panel titles: GPS Sig: 0 [m] (R); 1 [m] (G); 2 [m] (B); Alt: 2 [km] and 4 [km]. Horizontal axis: No. of Tracked Pnts (0 to 30).]

Figure 25: Variances of UAV pose angles- computed from rotation with respect to ref. coordinate system using Rodrigues formula for various altitudes, GPS measurement uncertainties (R,G,B), and number of terrain feature points tracked in two views. See text for details.